Avoid two-layer Jacobian concatenation for non-batchable backends #874

@gdalle

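As the benchmarks below show, mapreduce(hcat, ...) has serious allocation overhead: with AutoMooncakeForward(), the out-of-place jacobian takes about 220 ms, while the in-place jacobian! takes about 9 ms, on par with AutoForwardDiff(; chunksize=1) at 8 ms. Replacing mapreduce(hcat, ...) with stack when the batch size is 1 could be beneficial.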

Details
using BenchmarkTools
using DifferentiationInterface
using Mooncake: Mooncake
using ForwardDiff: ForwardDiff

# Elementwise function on a length-1000 input, so the Jacobian is 1000 x 1000
f(x) = map(cos, x);
x = ones(1000);
J = similar(x, length(x), length(x));  # preallocated storage for jacobian!

# Both backends compute one Jacobian column per pass (batch size 1)
prep_mooncake_forward = prepare_jacobian(f, AutoMooncakeForward(), x);
prep_forwarddiff1 = prepare_jacobian(f, AutoForwardDiff(; chunksize=1), x);

# Out-of-place: the Jacobian is assembled by concatenating columns
@btime jacobian($f, $prep_mooncake_forward, AutoMooncakeForward(), $x);  # 220 ms
@btime jacobian($f, $prep_forwarddiff1, AutoForwardDiff(; chunksize=1), $x);  # 8 ms

# In-place: columns are written directly into J
@btime jacobian!($f, $J, $prep_mooncake_forward, AutoMooncakeForward(), $x);  # 9 ms
@btime jacobian!($f, $J, $prep_forwarddiff1, AutoForwardDiff(; chunksize=1), $x);  # 8 ms
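
To make the proposed change concrete, here is a minimal sketch in plain Julia (1.9 or later, for stack). It is not DifferentiationInterface's actual internals: the batches variable is a hypothetical stand-in for the per-batch derivative columns, each batch being a 1-tuple because the batch size is 1. It only shows that a single stack call builds the same matrix as the two-layer concatenation, and makes no timing claim.

```julia
# Hypothetical stand-in data: 4 batches, each a 1-tuple holding one column of length 4.
n = 4
batches = [(collect(Float64, (n * (j - 1) + 1):(n * j)),) for j in 1:n]

# Two-layer concatenation: hcat within each batch, then hcat across batches.
J_two_layer = mapreduce(batch -> hcat(batch...), hcat, batches)

# Single layer: with batch size 1, each batch is exactly one column,
# so the columns can be stacked directly into a matrix.
J_stack = stack(only(batch) for batch in batches)

println(J_stack == J_two_layer)
```

Whether this removes the overhead seen above would still need to be checked with the benchmark in this issue.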
