
AutoEnzyme tries to differentiate Boolean Mask #824

@mmikhasenko

Description


Enzyme was the last of all the AD backends to give up on my problem. The bug is subtle: it seems to be reproduced only by combining two `PartitionMask`s.

ERROR: InexactError: Bool(1.559401f0)
Stacktrace:
  [1] Bool
    @ ./float.jl:251 [inlined]
  [2] convert
    @ ./number.jl:7 [inlined]
  [3] _setindex_scalar!
    @ ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/SparseArrays/src/sparsematrix.jl:3183
  [4] setindex!
    @ ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/SparseArrays/src/sparsematrix.jl:3180 [inlined]
  [5] reverse
    @ ~/.julia/packages/Enzyme/9dy6G/src/internal_rules.jl:854 [inlined]
  [6] mul!
    @ ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/matmul.jl:253 [inlined]
  [7] *
    @ ~/.julia/juliaup/julia-1.11.5+0.aarch64.apple.darwin14/share/julia/stdlib/v1.11/LinearAlgebra/src/matmul.jl:60 [inlined]
  [8] combine
    @ ~/Documents/ML.CAT/Bijectors.jl/src/bijectors/coupling.jl:125 [inlined]

pointing to the combine method in Bijectors.jl.
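The `InexactError` itself is easy to reproduce in isolation: `setindex!` on a `Bool`-eltype sparse matrix must convert the assigned value to `Bool`, which only succeeds for exact 0/1. A minimal sketch using only SparseArrays (no Bijectors or Enzyme involved), consistent with frames [1]–[4] of the trace:

```julia
using SparseArrays

M = sparse(Bool[1 0; 0 1])   # Bool-eltype sparse matrix, like a partition mask
M[1, 1] = 1.0                # fine: 1.0 converts exactly to true

err = try
    M[1, 2] = 1.559401f0     # the kind of write the reverse pass attempts
    nothing
catch e
    e                        # InexactError: Bool(1.559401f0), as in the report
end
```

So any reverse rule that accumulates floating-point cotangents in place into these `Bool`-valued matrices will hit this conversion.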

I did my best to simplify; here is an example that reproduces the error.

using Bijectors
using Distributions
using NormalizingFlows.Optimisers
using DifferentiationInterface
import Flux: Dense
import Enzyme

struct Trivial{T} <: Bijectors.Bijector
    params::AbstractVector{T}
end

Bijectors.transform(b::Trivial, x) = with_logabsdet_jacobian(b, x)[1]
Bijectors.logabsdetjac(b::Trivial, x) = with_logabsdet_jacobian(b, x)[2]
Bijectors.with_logabsdet_jacobian(b::Trivial, x::Real) = x, b.params[1]
Bijectors.with_logabsdet_jacobian(b::Trivial, x::AbstractVector) = x, b.params[1]
Bijectors.with_logabsdet_jacobian(ib::Inverse{<:Trivial}, y::Real) = y, -ib.orig.params[1]
Bijectors.with_logabsdet_jacobian(ib::Inverse{<:Trivial}, y::AbstractArray) = y, -ib.orig.params[1]

flow = let
    q₀ = Distributions.Product([Uniform(0, 1), Uniform(0, 1)])
    mask12 = Bijectors.PartitionMask(2, [1], [2])
    mask21 = Bijectors.PartitionMask(2, [2], [1])
    layers = [
        Coupling(Trivial ∘ Dense(1, 2), mask12),
        Coupling(Trivial ∘ Dense(1, 2), mask21),
    ]
    ts = reduce(∘, layers)
    transformed(q₀, ts)
end

let
    θ, re = Optimisers.destructure(flow)
    loss(θ) = -pdf(re(θ), [0.5, 0.5])
    loss(θ)
end # works

let # fails
    θ, re = Optimisers.destructure(flow)
    loss(θ) = -pdf(re(θ), [0.5, 0.5])

    ADbackend = AutoEnzyme(;
        mode = Enzyme.set_runtime_activity(Enzyme.Reverse),
        function_annotation = Enzyme.Const)

    prep = prepare_gradient(loss, ADbackend, θ)
    DifferentiationInterface.gradient(loss, prep, ADbackend, θ)
end

One observation: the code works with one layer and breaks with two.
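For context, my understanding (a paraphrase, not the exact source) is that `combine` stitches the partitions back together via the mask's sparse projection matrices, roughly `A_1 * y_1 + A_2 * x_2 + A_3 * x_3`. A SparseArrays-only sketch of why the forward pass is fine while the reverse pass is not:

```julia
using SparseArrays

# Hypothetical stand-ins for a PartitionMask's projection matrices;
# the Bool(1.559401f0) in the stack trace suggests they are Bool-valued.
A_1 = sparse([1], [1], [true], 2, 1)  # scatters partition 1 into slot 1
A_2 = sparse([2], [1], [true], 2, 1)  # scatters partition 2 into slot 2

y_1, x_2 = [0.5], [0.5]
y = A_1 * y_1 + A_2 * x_2  # forward pass is fine: promotes to Float64

# The reverse pass of this `*` is where the trace fails: the sparse
# reverse rule calls setindex!, and writing a float gradient into a
# Bool-eltype sparse matrix requires an exact 0/1 value.
```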
