
Commit 8a29b58

Add benchmark tutorial (#139)
1 parent b9d31ed

7 files changed

Lines changed: 139 additions & 27 deletions


DifferentiationInterface/README.md

Lines changed: 10 additions & 0 deletions
@@ -46,6 +46,16 @@ We also provide some experimental backends ourselves:
| [FastDifferentiation.jl](https://github.com/brianguenter/FastDifferentiation.jl) | `AutoFastDifferentiation()`, `AutoSparseFastDifferentiation()` |
| [Tapir.jl](https://github.com/withbayes/Tapir.jl) | `AutoTapir()` |

+## Installation
+
+In a Julia REPL, run
+
+```julia
+julia> using Pkg
+
+julia> Pkg.add(url="https://github.com/gdalle/DifferentiationInterface.jl", subdir="DifferentiationInterface")
+```
+
## Example

```jldoctest readme

DifferentiationInterface/docs/src/tutorial.md

Lines changed: 18 additions & 20 deletions
@@ -24,25 +24,25 @@ f(x::AbstractArray) = sum(abs2, x)
and a random input vector

```@repl tuto
-x = [1.0, 2.0, 3.0]
+x = [1.0, 2.0, 3.0];
```

To compute its gradient, we need to choose a "backend", i.e. an AD package that DifferentiationInterface.jl will call under the hood.
-[ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) is very efficient for low-dimensional inputs, so we'll go with that one.
-Most backend types are defined by [ADTypes.jl](https://github.com/SciML/ADTypes.jl) and re-exported by DifferentiationInterface.jl:
+Most backend types are defined by [ADTypes.jl](https://github.com/SciML/ADTypes.jl) and re-exported by DifferentiationInterface.jl.
+[ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) is very generic and efficient for low-dimensional inputs, so it's a good starting point:

```@repl tuto
backend = AutoForwardDiff()
```

-Now we can use DifferentiationInterface.jl to get our gradient:
+Now you can use DifferentiationInterface.jl to get the gradient:

```@repl tuto
gradient(f, backend, x)
```

Was that fast?
-We can use [BenchmarkTools.jl](https://github.com/JuliaCI/BenchmarkTools.jl) to answer that question.
+[BenchmarkTools.jl](https://github.com/JuliaCI/BenchmarkTools.jl) helps you answer that question.

```@repl tuto
@btime gradient($f, $backend, $x);
@@ -54,12 +54,12 @@ More or less what you would get if you just used the API from ForwardDiff.jl:
@btime ForwardDiff.gradient($f, $x);
```

-Not bad, but we can do better.
+Not bad, but you can do better.

## Overwriting a gradient

-Since we know how much space our gradient will occupy, we can pre-allocate that memory and offer it to AD.
-Some backends can get a speed boost from this trick.
+Since you know how much space your gradient will occupy, you can pre-allocate that memory and offer it to AD.
+Some backends get a speed boost from this trick.

```@repl tuto
grad = zero(x)
@@ -72,9 +72,9 @@ Note the double exclamation mark, which is a convention telling you that `grad`
@btime gradient!!($f, _grad, $backend, $x) evals=1 setup=(_grad=similar($x));
```

-For some reason the in-place version is not much better than our first attempt.
-However, as you can see, it has one less allocation: it corresponds to the gradient vector we provided.
-Don't worry, we're not done yet.
+For some reason the in-place version is not much better than your first attempt.
+However, it has one less allocation, which corresponds to the gradient vector you provided.
+Don't worry, you're not done yet.

## Preparing for multiple gradients

@@ -86,15 +86,14 @@ We abstract away the preparation step behind a backend-agnostic syntax:
extras = prepare_gradient(f, backend, x)
```

-You don't need to know what that is, you just need to pass it to the gradient operator.
+You don't need to know what this object is, you just need to pass it to the gradient operator.

```@repl tuto
grad = zero(x);
grad = gradient!!(f, grad, backend, x, extras)
```

-Why, you ask?
-Because it is much faster, and allocation-free.
+Preparation makes the gradient computation much faster, and (in this case) allocation-free.

```@repl tuto
@btime gradient!!($f, _grad, $backend, $x, _extras) evals=1 setup=(
@@ -105,7 +104,7 @@ Because it is much faster, and allocation-free.

## Switching backends

-Now the whole point of DifferentiationInterface.jl is that you can easily experiment with different AD solutions.
+The whole point of DifferentiationInterface.jl is that you can easily experiment with different AD solutions.
Typically, for gradients, reverse mode AD might be a better fit.
So let's try the state-of-the-art [Enzyme.jl](https://github.com/EnzymeAD/Enzyme.jl)!

@@ -121,7 +120,7 @@ But once it is done, things run smoothly with exactly the same syntax:
gradient(f, backend2, x)
```

-And we can run the same benchmarks:
+And you can run the same benchmarks:

```@repl tuto
@btime gradient!!($f, _grad, $backend2, $x, _extras) evals=1 setup=(
@@ -130,7 +129,6 @@ And we can run the same benchmarks:
);
```

-Have you seen this?
-It's blazingly fast.
-And you know what's even better?
-You didn't need to look at the docs of either ForwardDiff.jl or Enzyme.jl to achieve top performance with both, or to compare them.
+Not only is it blazingly fast, you achieved this speedup without looking at the docs of either ForwardDiff.jl or Enzyme.jl!
+In short, DifferentiationInterface.jl allows for easy testing and comparison of AD backends.
+If you want to go further, check out the [DifferentiationInterfaceTest.jl tutorial](https://gdalle.github.io/DifferentiationInterface.jl/DifferentiationInterfaceTest/dev/tutorial/).

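For reference, the workflow walked through in this tutorial diff can be condensed into one sketch. It only uses calls that appear above (`AutoForwardDiff`, `AutoEnzyme`, `gradient`, `prepare_gradient`, `gradient!!`); exact results and timings depend on your machine and package versions.

```julia
using DifferentiationInterface
import ForwardDiff, Enzyme

f(x::AbstractArray) = sum(abs2, x)
x = [1.0, 2.0, 3.0];

# Forward mode via ForwardDiff.jl
backend = AutoForwardDiff()
gradient(f, backend, x)                          # out-of-place gradient

# Pre-allocate the output and prepare the backend once
grad = zero(x)
extras = prepare_gradient(f, backend, x)
grad = gradient!!(f, grad, backend, x, extras)   # may reuse `grad`, reuses `extras`

# Same syntax with a reverse-mode backend
backend2 = AutoEnzyme(Enzyme.Reverse)
extras2 = prepare_gradient(f, backend2, x)
grad = gradient!!(f, grad, backend2, x, extras2)
```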
DifferentiationInterfaceTest/README.md

Lines changed: 11 additions & 1 deletion
@@ -20,4 +20,14 @@ Make it easy to know, for a given function:
- Type stability tests
- Count calls to the function
- Benchmark runtime and allocations
-- Weird array types (GPU, static, components)
+- Weird array types (GPU, static, components)
+
+## Installation
+
+In a Julia REPL, run
+
+```julia
+julia> using Pkg
+
+julia> Pkg.add(url="https://github.com/gdalle/DifferentiationInterface.jl", subdir="DifferentiationInterfaceTest")
+```

DifferentiationInterfaceTest/docs/Project.toml

Lines changed: 3 additions & 0 deletions
@@ -4,4 +4,7 @@ DataFrames = "a93c6f00-e57d-5684-b7b6-d8193f3e46c0"
DifferentiationInterface = "a0c0ee7d-e4b9-4e03-894e-1c5f64a51d63"
DifferentiationInterfaceTest = "a82114a7-5aa3-49a8-9643-716bb13727a3"
Documenter = "e30172f5-a6a5-5a46-863b-614d45cd2de4"
+Enzyme = "7da242da-08ed-463a-9acd-ee780be4f1d9"
ForwardDiff = "f6369f11-7733-5829-9624-2563aa707210"
+Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
+PrettyTables = "08abe8d2-0d0c-5749-adfa-8a2ac140af0d"

DifferentiationInterfaceTest/docs/make.jl

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ makedocs(;
    format=Documenter.HTML(),
    pages=[
        "Home" => "index.md", #
+       "Tutorial" => "tutorial.md", #
        "API reference" => "api.md",
    ],
)

DifferentiationInterfaceTest/docs/src/tutorial.md

Lines changed: 90 additions & 0 deletions

@@ -0,0 +1,90 @@
```@meta
CurrentModule = Main
```

# Tutorial

We present a typical workflow with DifferentiationInterfaceTest.jl, building on the [DifferentiationInterface.jl tutorial](https://gdalle.github.io/DifferentiationInterface.jl/DifferentiationInterface/dev/tutorial/) (which we encourage you to read first).

```@repl tuto
using DifferentiationInterface, DifferentiationInterfaceTest
import ForwardDiff, Enzyme
import DataFrames
```

## Introduction

The AD backends we want to compare are [ForwardDiff.jl](https://github.com/JuliaDiff/ForwardDiff.jl) and [Enzyme.jl](https://github.com/EnzymeAD/Enzyme.jl).

```@repl tuto
backends = [AutoForwardDiff(), AutoEnzyme(Enzyme.Reverse)]
```

To do that, we are going to take gradients of a simple function:

```@repl tuto
f(x::AbstractArray) = sum(sin, x)
```

Of course we know the true gradient mapping:

```@repl tuto
∇f(x::AbstractArray) = cos.(x)
```

DifferentiationInterfaceTest.jl relies on so-called "scenarios", in which you encapsulate the information needed for your test:

- the function `f`
- the input `x` (and output `y` for mutating functions)
- optionally a reference `ref` to check against

There is one scenario per operator, so here we will use [`GradientScenario`](@ref).
Let us experiment with various input types and sizes:

```@repl tuto
scenarios = [
    GradientScenario(f; x=rand(Float64, 3), ref=∇f),
    GradientScenario(f; x=rand(Float32, 3, 4), ref=∇f),
    GradientScenario(f; x=rand(Float16, 3, 4, 5), ref=∇f),
];
```

## Testing

The main entry point for testing is the function [`test_differentiation`](@ref).
It has many options, but the main ingredients are the following:

```@repl tuto
test_differentiation(
    backends,             # the backends you want to compare
    scenarios,            # the scenarios you defined
    correctness=true,     # compares values against the reference
    type_stability=true,  # checks type stability with JET.jl
    detailed=true,        # prints a detailed test set
)
```

If you are too lazy to manually specify the reference, you can also provide an AD backend as the `correctness` keyword argument, which will serve as the ground truth for comparison.

## Benchmarking

Once you are confident that your backends give the correct answers, you probably want to compare their performance.
This is made easy by the [`benchmark_differentiation`](@ref) function, whose syntax should feel familiar:

```@repl tuto
benchmark_result = benchmark_differentiation(backends, scenarios);
```

The resulting object is a `Vector` of structs, which can easily be converted into a `DataFrame` from [DataFrames.jl](https://github.com/JuliaData/DataFrames.jl):

```@repl tuto
df = DataFrames.DataFrame(benchmark_result)
```

Here's what the resulting `DataFrame` looks like with all its columns.
Note that we only compare (possibly) in-place operators, because they are always more efficient.

```@example tuto
import Markdown, PrettyTables # hide
Markdown.parse(PrettyTables.pretty_table(String, df; backend=Val(:markdown), header=names(df))) # hide
```

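The Testing section mentions, without showing code, that an AD backend can be passed as the `correctness` argument instead of a manual reference. A hypothetical call along those lines, reusing the `backends` and `scenarios` defined above, might look like this (a sketch based on that sentence, not on the package docs):

```julia
# Hypothetical: let ForwardDiff.jl provide the ground-truth values instead of ∇f.
test_differentiation(
    backends,
    scenarios,
    correctness=AutoForwardDiff(),  # reference backend instead of `true`
    type_stability=true,
    detailed=true,
)
```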
DifferentiationInterfaceTest/src/test_differentiation.jl

Lines changed: 6 additions & 6 deletions
@@ -69,12 +69,12 @@ function test_differentiation(
        excluded,
    )

-    title =
-        "Testing" *
-        (correctness != false ? " correctness" : "") *
-        (call_count ? " calls" : "") *
-        (type_stability ? " types" : "") *
-        (sparsity ? " sparsity" : "")
+    title_additions =
+        (correctness != false ? " + correctness" : "") *
+        (call_count ? " + calls" : "") *
+        (type_stability ? " + types" : "") *
+        (sparsity ? " + sparsity" : "")
+    title = "Testing" * title_additions[3:end]

    prog = ProgressUnknown(; desc="$title", spinner=true, enabled=logging)

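To make the refactored title logic easier to follow, here is a standalone trace of the new code with hypothetical flag values (not part of the package):

```julia
# Hypothetical flag values mimicking the keyword arguments of `test_differentiation`
correctness, call_count, type_stability, sparsity = true, false, true, false

title_additions =
    (correctness != false ? " + correctness" : "") *
    (call_count ? " + calls" : "") *
    (type_stability ? " + types" : "") *
    (sparsity ? " + sparsity" : "")
# title_additions == " + correctness + types"

# Dropping the first two characters (" +") leaves a space between "Testing" and the first item:
title = "Testing" * title_additions[3:end]
# title == "Testing correctness + types"
```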
0 commit comments
