Logging interface by lkdvos · Pull Request #7 · JuliaManifolds/AlgorithmsInterface.jl

lkdvos · 2025-10-16T22:44:34Z

Very WIP and rough start of getting this on the road. I've basically spent the entire day toying around with a bunch of different strategies and approaches, based on as many different frameworks as I could find, and wasn't really happy with most of them.
I think I'm starting to settle on some parts I like, but this still needs a lot of work.

(this is not ready for review, just wanted to post some progress here to let you know that I am actually looking into it)

Some comments on what I have:

I somehow wanted to replace debug with logging, since I get confused about the meaning of the word debug, which for me means figuring out things that arent working. Happy to change that if this is too radical, just a suggestion.
Lots and lots of bikeshedding about names
The logging system as a whole is painful because there are too many ways that all look very equivalent to achieve the same thing.

Some vague goals:

separate out generating log records and processing them, in order to unify printing and recording data
have a system which is extensible, so a specific algorithm may introduce new logging events
have a default system which works in most of the cases
have a system which is compatible with subsolvers

I've started by migrating some of the Manopt debug things. I am for now trying to avoid having to wrap many things, although I might go back on this. Having problem, alg, state also makes it not an obvious choice what to wrap.

Notably missing from the things I've started porting is the logic for deciding when to print something. I have some vague intuition/notion that this should be handled by the logger, although I have not figured out a satisfying way to achieve this.

Also missing is the formatting parts. In order to unify logging and recording, the default implementation does not return strings, rather it produces data. I'm hoping to be able to tie unique id's to these logs, which in turn the logger could use to format into a string if it wants to print, or to push to a vector of data if it wants to record.

I am probably going to try use LoggingExtras.jl to do some of the routing, ie having a logger that splits to both a file and an io and a recording device sounds important

The factories probably will have to return both a logger to handle the events, as well as a set of action to generate the events.

all of this is already sounding incredibly involved, and I'm slightly scared about overengineering it to a point where it becomes virtually unusable, but for now I'm hopeful and will at least continue a little bit to see how it shakes out. Take everything with a grain of salt, it might be necessary to discard everything in the end.

kellertuer · 2025-10-17T07:20:55Z

Using Logging instead of Debug in naming is totally fine. Debug might only be one reason one wants to see prints :)

kellertuer · 2025-10-17T07:28:19Z

My system also only very slowly evolved into the current state, so I do see that starting that anew from scratch might look like quite some work. I can take a look at the sub solver ideas once you have settled your ideas a bit as well as the factory.

About the “what-to-warp” I think that is a state-property to know wether to log or not; neither problem nor algorithm are changed.

The Manopt subsolver debug there is the “sending side”

https://github.com/JuliaManifolds/Manopt.jl/blob/392ee0e31e61beeffcca1e27a71796d71e6e7ddd/src/plans/debug.jl#L196-L202

as well as the “receiving side”

https://github.com/JuliaManifolds/Manopt.jl/blob/392ee0e31e61beeffcca1e27a71796d71e6e7ddd/src/plans/debug.jl#L841

and sure that wraps a certain debugger inside. - and the communicate via the set_parameter! mechanism I used in several areas of Manopt, since that is even designed to remember a chain where to pass things to (here the substates debug).

lkdvos · 2025-11-05T00:16:58Z

So, while it took me some time to actually write some updates, this has actually been in the back of my mind for quite some time.
The main issue was that I couldn't really find a solution that got all of my goals in a satisfactory way, but I think I finally found something I quite like.

One of the main reasons I was drawn to the logging system is that it does not require any wrapping, and I was hoping to find a way to not have to pollute the actual algorithms too much.
However, I kept getting annoyed by some pretty fundamental problems with that approach:

What do I do with logging events that are not from an algorithm
How do I have no (or very little) overhead by default
The first issue could be solved by manually wrapping a parent logger, and then at every algorithm start using a with_logger approach, which then filters out the correct messages and dispatches the algorithm events correctly.
I tried this out, but didn't really like the end result.
The fact that this basically requires that you always wrap the algorithm, or deal with the fact that you get log messages into the logging pipeline that are "unhandled" really felt wrong.
The same can be said for that second issue, where it is easy to turn off the logging with very little overhead, but this is not the case by default.
An alternative solution I played around with a bit was to emit custom LogEvent structs, and attempt to use dispatch in combination with the logging system to handle this.
Unfortunately, the logging system is pretty much designed to not specialize as much as possible, and I couldn't find a way to not run into method ambiguities.
While this could be manually patched for the base loggers, every single new logger would have the same issue, so this was also off the table.

What kept me kind of fixated on this is that it felt like it really almost works the way I wanted it to, so finally I bit the bullet and realized I can just partially rewrite the entire logging system myself, copying what I like and not copying what I don't.
So here is my new attempt, which is in spirit very similar to Manopt, with the primary difference being that I used a scoped value.
The idea being that this should quite naturally be composable for subsolvers, having their own scope, doesn't interfere with other logging machinery, still has full flexibility, and the default case should have very little overhead.

I hope it is slightly clear what I'm trying to achieve, I still have some work cut out for things like factories, and properly writing some docs for this, but I wanted to already push this and perhaps get an opinion or two?

kellertuer · 2025-11-05T07:45:17Z

Overall the newest commit looks very nice. I already had one comment on the code, see above, that one could reduce overhead also by having a wrapper for the algorithm state, is that without that wrapper no logging functions are ever called. This is what I do with the DebugState in Manopt.

lkdvos · 2025-11-06T17:58:50Z

Hmmm, I would really like to avoid having to do additional wrapping, since the wrapping is already happening in the scoped value.
I did some minimal benchmarks, using the root-finding problem in our testsuite.
For a 1000 iterations of the newton method, which is realistically the worst-case scenario for something you would ever use an iterative solver for, as the computations are very cheap, I get these timings before and after:

julia> @benchmark solve($problem, $algorithm1)

BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  128.583 μs …  6.155 ms  ┊ GC (min … max): 0.00% … 96.95%
 Time  (median):     142.584 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   147.027 μs ± 97.708 μs  ┊ GC (mean ± σ):  2.41% ±  4.95%

              ▄▇▆▃▄▄▇█▅▂
  ▂▄▄▃▄▄▆▆▇▅▅███████████▇▇▇▆▅▅▅▅▄▄▄▄▃▃▂▃▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  129 μs          Histogram: frequency by time          175 μs <

 Memory estimate: 156.30 KiB, allocs estimate: 10002.

julia> @benchmark solve($problem, $algorithm1)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  164.042 μs …   6.906 ms  ┊ GC (min … max): 0.00% … 97.27%
 Time  (median):     174.250 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   179.881 μs ± 100.043 μs  ┊ GC (mean ± σ):  2.00% ±  4.83%

       ▁▅▇█▆▂▂▂▃▃▃
  ▁▁▂▃▆█████████████▇▇▆▆▇▇▆▆▅▅▄▃▃▃▃▂▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁ ▃
  164 μs           Histogram: frequency by time          207 μs <

 Memory estimate: 156.30 KiB, allocs estimate: 10002.

In other words, the overhead of the logging infrastructure when there is no logging happening is about 30-40 ns per iteration. I think I find this acceptable, but that's probably because of my use-cases far exceeding these values. I don't know what you think?

(as a sidenote, this looks like a lot of allocations are happening, which is definitely something to investigate)

kellertuer · 2025-11-06T18:08:50Z

Hm, my idea of the wrapper was that there is not even that code in the default case, but you might be right that this super simple newton is already an extreme case and the overhead does look ok, though we would always have case 2 in your scenario?
I do not have a super strong reason for the wrapper, just my default way to do stuff, so I am also maybe just used to that paradigm.
The plain iteration looks a bit nicer without the logs calls,...but again, not a strong push for that, just personal reference.

For the allocs, maybe just how Newton is implemented there.

codecov · 2025-11-06T18:11:21Z

Codecov Report

❌ Patch coverage is 95.74468% with 2 lines in your changes missing coverage. Please review.
⚠️ Please upload report for BASE (main@4a60118). Learn more about missing BASE report.

Files with missing lines	Patch %	Lines
src/logging.jl	97.56%	1 Missing ⚠️
src/stopping_criterion.jl	0.00%	1 Missing ⚠️

Additional details and impacted files

@@           Coverage Diff           @@
##             main       #7   +/-   ##
=======================================
  Coverage        ?   56.36%           
=======================================
  Files           ?        5           
  Lines           ?      220           
  Branches        ?        0           
=======================================
  Hits            ?      124           
  Misses          ?       96           
  Partials        ?        0

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

lkdvos · 2025-11-06T18:19:15Z

So, I've updated the default implementation, the overhead is now completely negligible and it looks like the compiler can union-split where it needs to. (There is some variability on the benchmarks depending on what my computer is doing at the same time I think)

This clearly has less overhead, but the iteration now looks a bit more complicated, and we lose the ability to enable or disable the logging after the solving has started, which was previously actually possible in concurrent environments.
I think I'm fine with either, I don't mind the additional 4 lines in our default solve! implementation, and I like that there is only a single entry point if I ever would overload something.

julia> @benchmark solve($problem, $algorithm1)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  134.250 μs …   7.939 ms  ┊ GC (min … max): 0.00% … 97.87%
 Time  (median):     149.375 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   155.963 μs ± 123.466 μs  ┊ GC (mean ± σ):  2.95% ±  4.94%

                 ▆█▃▁▁▁▁▁
  ▁▁▁▁▁▁▁▁▁▂▂▂▂▄█████████▇▇▇▆▆▄▄▃▃▂▂▂▂▂▃▄▂▂▂▂▂▁▁▁▁▁▁▁▁▁▂▂▂▁▁▁▁▁ ▃
  134 μs           Histogram: frequency by time          178 μs <

 Memory estimate: 156.30 KiB, allocs estimate: 10002.

julia> @benchmark solve($problem, $algorithm1)
BenchmarkTools.Trial: 10000 samples with 1 evaluation per sample.
 Range (min … max):  141.750 μs …   9.444 ms  ┊ GC (min … max): 0.00% … 98.15%
 Time  (median):     148.667 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   155.532 μs ± 128.421 μs  ┊ GC (mean ± σ):  2.92% ±  4.94%

     ▃██▃  ▁▁▁▁
  ▁▂▅████████████▇▆▆▅▄▃▃▃▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▃▅▄▂▁▂▂▂▁▁▁▁▁▁▁▁▁▁ ▃
  142 μs           Histogram: frequency by time          180 μs <

 Memory estimate: 156.30 KiB, allocs estimate: 10002.

kellertuer · 2025-11-06T18:34:20Z

Oh, the argument to interactively add logs sounds like an argument for your approach. In my wrap idea one would have to stop, wrap, and restart-solve which might recompile a few things.

I have not checked what LOGGING_ENABLED is but the current solution looks like a good mix of both, while keeping your flexibility during the iterations. So lets keep it like it is now

kellertuer · 2025-11-06T18:35:36Z

also good it is no longer log!, both because we have that function on manifolds and the new name sounds a bit better :)

edit: checked the overall PR and understand the variables now, neat :)

lkdvos · 2025-11-06T18:43:05Z

Ok, let me try and get some tests, some factories and the docs in, shoud make things more clear too

kellertuer · 2025-11-07T06:02:19Z

The extended docs look very neat! I think for a few of the actions and stopping criteria I would also like to work on a “convenience layer”, that you can just say something like debug = [:Time,] or so to generate the three actions to measure time (same for the stopping criteria) – so having some factories for these so the end user has to think a bit less :).
But I like the fine granularity one could do by hand every time as well.

kellertuer · 2025-11-07T08:01:27Z

The docs are indeed very nice.

I have a comment on indicates_conovergence: In Manopt I have a second function by now, that sounds a bit less formal: has_converges. The difference is as follows (and Herre even nicer to divide):

indicate convergence is the static state, whether a stopping criterion would always stop due to a convergence certificate.
i.e. stopping after an iteration would never indicate that, stop when stale would. their or would not, their and would
has converged would be the dynamic thing, for an actual algorithm that ran. Then the first thing is that it actually has finished. Afterwards the difference would be for the or case: Depending on which one “fired”, it would return true even (if stale has indicated)

One could also try to interface some status report to some enums - but that is a bit tricky maybe with the or/_and_s ;)

I can work on this after this PR when I tackle a stripping criterion factory (to handle a few kwargs automatically like maxiter or maxtime) and turn them into stopping criteria

lkdvos · 2025-11-07T17:00:10Z

https://juliamanifolds.github.io/AlgorithmsInterface.jl/previews/PR7

kellertuer · 2025-11-07T17:09:23Z

Nice!

You have a good tendency to confuse me with function names: log! is the in-place logarithmic map and you used that for a while, now there is also GroupAction, cf. https://juliamanifolds.github.io/LieGroups.jl/stable/interface/actions/#LieGroups.GroupAction. Yeah its is slowly getting late here - so besides naming - these are great tutorials!

They go in full detail with all technical stuff necessary. I hope (aas mentioned) to do some them on a convenience layer a bit shorter / more accessible, but that would then be code/part of a tutorial before (with a bit less technical detail). Great work!

lkdvos · 2025-11-07T17:30:44Z

The best part is that I actually realized this the moment I wrote it down, that group action has a very different meaning 🙃 I'm happy to change that though, just wanted to get everything more or less in and stable. (Also, I realized that in the current implementation, we could even have with_algorithmlogger(:PreStep => action1, :PreStep => action2) to automatically do this, or again use & or some other syntax, in which case we wouldn't even need to export this :)

I think I would propose the following plan:

I'm finalizing this PR by adding some tests, after which I would like to make some final changes based on your feedback, surrounding naming, wrapping, wording etc.
I 200% agree that the current actions are too bare bones, and we can and should add a number of convenience builtins that make life a lot easier. However, in an attempt to make this PR not too big, I'd maybe save that for a follow-up, to make reviewing and seeing what happens easier?
Writing the example for the docs has made me realize that there are some minor points in the interface that we might be able to improve upon. Since we aren't at a release stage though, I'd also like to bring that up in a separate issue, after getting this in.
I'm not sure I understood your comments about indicates_convergence, and the interplay between stopping criteria and logging is still a bit fuzzy to me. I'd love to formalize that part a bit more, since I'd argue that this is definitely an important aspect, but it might again be easier to reason about this after already merging this PR.

kellertuer · 2025-11-07T19:32:20Z

since GroupAction is an action on a group and here we have a group of actions, what to you think about ActionGroup ?

I like the idea with the pairs :Context => Action that looks nice. And within a context building a group, & is ok, but we could also use some other binary operator is we have one that seems fitting?

Your next steps sound great!

For the convergence, maybe have the following in mind that is even easier to split here (and naming of functions can surely be improved)

indicates_convergence(sc) is more a mathematical and static property to ask. So this would here only depend on a StoppingCriterion and state, whether (if this indicates to stop) it indicates we converged to something. StopAfterIteration(10) does not indicate convergence but a StopWhenGradientNormLess indicates convergence (at least to a critical point). This is static - no algorithm needs to run for this.
has_converged(sc, sc_state) is a thing depending on the StoppingCriterionState and could in for the simple cases like StopWhenGradientLess just be indicates_convergence(sc) && is_finihsed(...)

For the StopWhenAny the difference is very subtle. As a static guarantee (does it always indicate convergence?) all its components have to answer true for the. For has_convergeed on the other hand, among those that indicated is_finished, one has have converged.

lkdvos · 2025-11-08T12:57:35Z

Ah, I think I'm following what you mean with the distinction now, that does make sense indeed.

kellertuer

Though there is one minor remark I have for the docs, I already mark it as approved, since that is not an urgent one.

Thanks for the detailed technical docs. It helped me to see how nice the Logging ended up being and to also see a nice explanation for the stopping criteria.

Co-authored-by: Ronny Bergmann <git@ronnybergmann.net>

lkdvos added 3 commits October 16, 2025 18:30

add logging file

11facf1

add logging statements

2d740b8

some progress

81c23a1

kellertuer reviewed Oct 17, 2025

View reviewed changes

Comment thread src/interface/interface.jl Outdated

some actual progress

e11636f

lkdvos force-pushed the ld-debug branch from aadbbff to e11636f Compare November 4, 2025 23:07

small fixes

0b7ea5d

fix Printf compat

ad95017

minor optimizations

195a9c7

one more optimization attempt

2e9d3df

lkdvos added 4 commits November 6, 2025 20:46

various improvements and rework interface.md

b9c9b26

small docstring typos

e293ccd

stopping criterion docs and small fixes

0eb141a

WIP

8d5f452

kellertuer reviewed Nov 7, 2025

View reviewed changes

Comment thread docs/src/logging.md Outdated

lkdvos added 2 commits November 7, 2025 08:48

collapse docstrings by default

c40765e

write logging docs

464882c

lkdvos added 2 commits November 7, 2025 11:44

clean up and finish logging docs

d285b04

fix test

ba8739a

lkdvos added 4 commits November 7, 2025 12:55

add iteration test

78006a1

add error test

e4fe20e

Add test for IfAction

51f7a68

Add GroupAction test

c356152

lkdvos changed the title ~~[WIP] Logging interface~~ Logging interface Nov 7, 2025

add global toggle test

bd1d900

GroupAction -> ActionGroup

703284a

kellertuer approved these changes Nov 8, 2025

View reviewed changes

Comment thread docs/src/interface.md Outdated

Update interface.md

042a369

Co-authored-by: Ronny Bergmann <git@ronnybergmann.net>

lkdvos merged commit c0ad8e3 into main Nov 8, 2025
11 checks passed

lkdvos deleted the ld-debug branch November 8, 2025 18:50

Conversation

lkdvos commented Oct 16, 2025

Uh oh!

kellertuer commented Oct 17, 2025

Uh oh!

kellertuer commented Oct 17, 2025

Uh oh!

Uh oh!

lkdvos commented Nov 5, 2025

Uh oh!

kellertuer commented Nov 5, 2025

Uh oh!

lkdvos commented Nov 6, 2025

Uh oh!

kellertuer commented Nov 6, 2025

Uh oh!

codecov Bot commented Nov 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

lkdvos commented Nov 6, 2025

Uh oh!

kellertuer commented Nov 6, 2025

Uh oh!

kellertuer commented Nov 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lkdvos commented Nov 6, 2025

Uh oh!

kellertuer commented Nov 7, 2025

Uh oh!

kellertuer commented Nov 7, 2025

Uh oh!

Uh oh!

lkdvos commented Nov 7, 2025

Uh oh!

kellertuer commented Nov 7, 2025

Uh oh!

lkdvos commented Nov 7, 2025

Uh oh!

kellertuer commented Nov 7, 2025

Uh oh!

lkdvos commented Nov 8, 2025

Uh oh!

kellertuer left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov Bot commented Nov 6, 2025 •

edited

Loading

kellertuer commented Nov 6, 2025 •

edited

Loading