Reworking the API and Docs to merge in DynamicalSystems.jl#19

Open
Datseris wants to merge 19 commits into main from complexity_api_rework

@Datseris
Member

High level description: rework API so that it directly inherits and expands the types and functions from ComplexityMeasures.jl. Also re-work the docs to have the style of DynamicalSystems.jl with a main Tutorial and then a formal API page that lists all docstrings.

@gabriel-ferr
Member

Following the todo list, I'm renaming the microstates shape types, using something like: "Rect" -> "RectMicrostate", and also removing the recurrence expression dependence from them.

However, I don't understand the item "Furthermore, they should be orthogonal inputs to the main outcome space t" ...

@gabriel-ferr
Member

gabriel-ferr commented Jan 31, 2026

I changed the histogram input, it now uses the RecurrenceMicrostate type instead of the core. I also implemented the interface with ComplexityMeasures, but I am not sure whether it is working correctly. When I tried to test it, I got the following error:
"probabilities(rmspace, X)
ERROR: MethodError: no method matching Probabilities(::Int64, ::Int64)
The type Probabilities exists, but no method is defined for this combination of argument types when trying to construct it."

About the outcomes, I used the indices given by "eachindex(counts)". I'm not sure if it is the best approach. Other alternatives could be matrix representations, but for spatial data this can be problematic, not to mention the memory allocation cost. Another option is the binary representation, but it can be harder to read. Since the decimal representation is already used for counting the microstates when computing the histogram, the indices of eachindex(counts) are also a representation of the microstates.

Another question: What do I do with the distribution functions? Delete them, or try to adapt them and mark them as deprecated?

@Datseris
Member Author

Datseris commented Jan 31, 2026

When reporting error messages, please include the whole stack trace (and make sure to wrap it in triple backticks so it is formatted as code). I don't know if your error is internal or just because you didn't pass the proper outcome space struct.

> About the outcomes, I used the indices given by "eachindex(counts)". I'm not sure if it is the best approach. Other alternatives could be matrix representations, but for spatial data this can be problematic, not to mention the memory allocation cost.

This is something we can think about at the end; it is not important right now. I assume that for a given N you have a unique and reversible way to encode each microstate to a unique integer. Correct?

@gabriel-ferr
Member

Understood. I fixed it. I was returning the wrong value in the function ComplexityMeasures.counts_and_outcomes. Now it is working (I think... I'll test it more later).

> This is something we can think at the end, it is not important right now. I assume that for a given N you have a unique and reversible way to encode each microstate to a unique integer. Correct?

Yes. A recurrence microstate can be interpreted as a binary number, since it is composed of 0s and 1s. When we define the microstate shape, we also define how to read this microstate as a binary number. Then, during sampling, the package takes an initial position of the time series, $(i, j)$, and computes the recurrence for each position starting from this initial position, following the offsets given by the function RecurrenceMicrostatesAnalysis.get_offsets(core::RMACore, shape::MicrostateShape).

Each of these recurrences is associated with a power of 2 from a power vector provided by the function RecurrenceMicrostatesAnalysis.get_power_vector(core::RMACore, shape::MicrostateShape). Using it, the package converts the binary representation of the microstate into its decimal representation while it is computing the recurrences =D , and this decimal representation is used as the index to count the microstate.
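
A minimal sketch of the counting step described above, assuming a scalar time series, a plain threshold recurrence, and a 2×2 shape. The names `offsets`, `powers`, and `microstate_index` are hypothetical illustrations; they are not the actual outputs of `get_offsets` or `get_power_vector`, and the reading order used here is an assumption:

```julia
# Hypothetical reading order for a 2x2 microstate: (row offset, column offset).
offsets = [(0, 0), (0, 1), (1, 0), (1, 1)]
# One power of 2 per offset, in the same order.
powers = [1, 2, 4, 8]

# Accumulate the decimal index of the microstate anchored at (i, j)
# while the individual recurrences are being evaluated.
function microstate_index(x, i, j, ε, offsets, powers)
    idx = 1  # 1-based index, so the all-zero microstate maps to 1
    for ((di, dj), p) in zip(offsets, powers)
        recurrent = abs(x[i + di] - x[j + dj]) <= ε
        idx += p * recurrent
    end
    return idx
end

x = [0.0, 0.1, 1.0, 1.05]
all_recurrent = microstate_index(x, 1, 1, 0.2, offsets, powers)
none_recurrent = microstate_index(x, 1, 3, 0.2, offsets, powers)
```

The index can then be used directly as the position to increment in a `counts` vector of length `2^length(offsets)`.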

And, of course, each index can be easily converted to its binary representation using digits(N - 1, base = 2). For example:

N = 55
digits(N - 1, base = 2)
0 1 1 0 1 1

For a 3x3 microstate, it is:

M = ⎡0  1  1⎤
    ⎢0  1  1⎥
    ⎣0  0  0⎦

Or for a triangle microstate with size 3 (the reading order is different):

M = ⎡0  1  0⎤
    ⎢   1  1⎥
    ⎣      1⎦

Note: N - 1 is because Julia works using 1-based indexing, but the binary ↔ decimal conversion works using 0-based.
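
The round trip between a microstate and its index can be sketched as follows, using only Base Julia. The helpers `encode` and `decode` are hypothetical names for illustration, and the row-by-row reading order is taken from the 3x3 example above:

```julia
# Hypothetical helpers, not part of the package API.
# bits[k] is weighted by 2^(k - 1); the +1 shifts to 1-based indexing.
encode(bits) = 1 + sum(b << (k - 1) for (k, b) in enumerate(bits))

# `pad` keeps the leading zeros so every microstate has the same length.
decode(index, nbits) = digits(index - 1, base = 2, pad = nbits)

bits = [0, 1, 1, 0, 1, 1, 0, 0, 0]  # the 3x3 microstate above, read row by row
index = encode(bits)

# reshape fills column by column, so transpose to recover the row-wise layout.
M = permutedims(reshape(decode(index, 9), 3, 3))
```

Because `decode(encode(bits), length(bits)) == bits` for any bit vector, the integer index is a lossless representation of the microstate once the shape's reading order is fixed.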

@gabriel-ferr
Member

I think it is working now. I had to add a probabilities overload to support CRP / two inputs, since it was throwing an error and I believe there is no such implementation in ComplexityMeasures.jl. So I added it to complexity_measures_interface:

function ComplexityMeasures.probabilities(o::RecurrenceMicrostates, x, y)
    return first(probabilities_and_outcomes(o, x, y))
end

function ComplexityMeasures.probabilities_and_outcomes(o::RecurrenceMicrostates, x, y)
    cts, outs = counts_and_outcomes(o, x, y)
    probs = Probabilities(cts, outs)
    return probs, outcomes(probs)
end

I also removed the RMACore from the user-facing API; it is now an internal structure that is automatically selected based on the input type =D

Then, the RecurrenceMicrostates struct now is:

struct RecurrenceMicrostates{MS <: MicrostateShape, RE <: RecurrenceExpression, SM <: SamplingMode} <: ComplexityMeasures.CountBasedOutcomeSpace 
    shape::MS
    expr::RE
    sampling::SM
end

@Datseris
Member Author

Alright @gabriel-ferr I am back here. I've updated the structure of the documentation to reflect a more standardized setup where information across DynamicalSystems.jl shares the same structure and hence is easier to access. Here is the plan I propose going forward:

  1. Finish the main tutorial. Make sure all code in the tutorial runs.
  2. Make sure the docs build: the make.jl file runs without errors.
  3. At that point I'll give a second review to this PR. Tag me here with @Datseris when it is time.
  4. We update and reinform the interface according to my review if need be, but looking at your comments above, there doesn't appear to be any need.
  5. Move all extra examples and extra code into the new Examples page.
  6. List all docstrings clearly in the API.md page. There should be no explanations there, no additional text on how to e.g. specify a `RecurrenceExpression`. All of this information must be inside docstrings.
  7. Update the integrations page (ML here).

Done!

@gabriel-ferr
Member

gabriel-ferr commented Mar 23, 2026

Hello @Datseris 0/

I have two questions:


  1. About it:
# All of these quantities like laminarity are in fact _complexity measures_
# which is why RecurrenceMicrostateAnalysis.jl fits so well within the
# interface of ComplexityMeasures.jl.

Would it be a good idea to rewrite these quantifiers (DET, LAM, RR, and Disorder) as a ComplexityEstimator or a DiscreteInfoEstimator, and then use the complexity or information functions to compute them?

However, I'm not entirely sure where each of them fits 🙃. RR as information, and the others as complexity??

My intuition is that this could improve the integration between the two libraries and make it easier to use.

Edit: Currently RMA.jl implements its own type QuantificationMeasure and computes it using a function measure, so I think it is easy to modify 🙂


  2. I've been trying to implement another setting in the RecurrenceMicrostate struct. Right now, this structure answers three questions:
  • How to compute recurrence between two states of the input (e.g., typical threshold, corridor threshold, which metric to use, etc.)
  • What is the microstate shape and size (e.g., 2×2 rectangle, 3×3 rectangle, triangle, etc.)
  • How the microstates are sampled from the RP (e.g., random sampling, all overlapping microstates, etc.)

I was thinking about adding a fourth question:

  • From where should we extract the microstates? (e.g., from the entire RP, only below the LOI, etc.)

Would it be okay to include this in the current PR, or would it be better to finish this one and open a new PR later?


About it:

# TODO: The two numbers reported above are not the same.
# Perhaps the logarithm base is off?

Yes, internally optimize(Threshold(), RecurrenceEntropy(), X, N) uses base e, while entropy(Shannon(), rmspace, X) uses base 2. So we need to call entropy(Shannon(; base = MathConstants.e), rmspace, X) to make them consistent.
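
The size of the mismatch can be checked with a small Base Julia sketch. The helper `shannon` below is a hypothetical stand-in written for this illustration; it is not the ComplexityMeasures.jl implementation:

```julia
# Hypothetical helper: Shannon entropy of a probability vector in a given base.
shannon(p; base = 2) = -sum(x * log(x) / log(base) for x in p if x > 0)

probs = [0.5, 0.25, 0.25]
H2 = shannon(probs)                           # base 2 (bits)
He = shannon(probs; base = MathConstants.e)   # base e (nats)

# The two values differ only by the constant factor log(2).
ratio = He / H2
```

So two entropies computed from the same distribution in different bases always differ by the factor `log(2) ≈ 0.693`, which is consistent with the discrepancy noted in the TODO.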

@Datseris
Member Author

Question 1: yes.

How to make a distinction: quantities that can be estimated from probabilities are information measures. Quantities that must be estimated directly from input data are complexity measures. Perhaps you can write here a list of each measure and where you think it fits before committing to changing the code?

Question 2: better to do it in a new PR. You can open an issue (feature request) listing this for now so that you don't forget it.

Point 3: logarithm base. All default functions from ComplexityMeasures.jl use logarithm with base 2, so if we go ahead with the change outlined in question 1, the same will be done for all recurrence measures here. So in the end the two numbers will be the same, it will be taken care of internally.

@gabriel-ferr
Member

> Question 2: better to do it in a new PR. You can open an issue (feature request) listing this for now so that you don't forget it.

Understood, I'll open an issue 🙂

> All default functions from ComplexityMeasures.jl use logarithm with base 2, so if we go ahead with the change outlined in question 1, the same will be done for all recurrence measures here. So in the end the two numbers will be the same, it will be taken care of internally.

Ok, I'll change all to use base 2.

> Perhaps you can write here a list of each measure and where you think it fits before committing to changing the code?

Ok, let's try it.

  • Recurrence Rate: it can be estimated from any RMA distribution, so I think it fits better as an information measure.
  • Determinism and laminarity: they are also estimated from an RMA distribution, but this distribution is very specific. You need to use a square 3x3 microstate, or a diagonal (line) microstate with size 3 (this second option is faster, but you need to compute two distributions if you want to estimate both quantities). So I think they fit better as complexity measures, because you cannot use just any probability distribution; it must be a specific one.
  • Disorder: it is also computed using RMA distributions, but you need to compute several distributions and extract the disorder from each of them, and then take the maximum disorder from these results. So it is probably a complexity measure too.
  • Partial disorder: I add it here because you can also compute "disorder" for a single probability distribution, or for just one class of microstates. In this case, it is an information measure. You can compute it using:
measure(settings::Disorder{N}, probs::Probabilities, norm_param::Int) # for a distribution 
measure(settings::Disorder{N}, class::Int, probs::Probabilities) # for a specific class

Really, classifying them is a little strange ... but probably:

RecurrenceRate <: DiscreteInfoEstimator
RecurrenceDeterminism <: ComplexityEstimator
RecurrenceLaminarity <: ComplexityEstimator
Disorder <: ComplexityEstimator
PartialDisorder <: DiscreteInfoEstimator
PartialClassDisorder <: DiscreteInfoEstimator

@Datseris
Member Author

> Recurrence Rate: it can be estimate from any RMA distribution, so I think it fits better as an information measure.

This is a common misconception. What you wrote above means that recurrence rate is not an information measure, because it cannot be estimated from any probability distribution: it has to be the probabilities of the RMA distribution.

Think of the Shannon entropy: it doesn't matter where the probabilities come from, their Shannon entropy is the same. Now what about permutation entropy? It is the Shannon entropy of the probabilities of the ordinal pattern distribution. As such, permutation entropy is not an information measure, it is a complexity measure.

So, in summary, I can see now that all recurrence measures are complexity measures and not information measures!

The next question is: is there any overlap in the estimation of these measures? Do you first have to estimate the same things (the same probabilities) for several measures, such as recurrence rate or determinism?

@gabriel-ferr
Member

gabriel-ferr commented Mar 24, 2026

Oh, fine, I understand now.

> The next question is: is there any overlap in the estimation of these measures? Do you have to estimate first same things (same probabilities) for many various measures such as recurrence rate or determinism?

Well, recurrence entropy and recurrence rate can be estimated from any recurrence distribution — e.g. if you compute a recurrence distribution for microstates $1\times1$, it will result in two probabilities: for 0 and for 1, and the probability for 1 is the recurrence rate 🙂 (for 0, do we have the non-recurrence rate?). But you can also take it as a mean from the recurrence rate of each microstate, then you can estimate it using any distribution.

For determinism and laminarity you need to use a square $3\times3$ microstate, but it is also possible to use line or diagonal microstates (keeping the length 3) to speed it up. In this second case it is necessary to compute both individually.

Disorder only works for square microstates. Usually $2 \times 2$ has a very poor resolution, then we typically are using $3 \times 3$ or $4 \times 4$. I used bitwise permutations to compute the labels for $5 \times 5$, then it is also possible.

So you don't need to estimate the same probabilities for all of these things, and in some situations you cannot do it. But if you want, it is possible (e.g. always using $3\times3$ 🙂)
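
As a sketch of the "mean over microstates" reading of the recurrence rate mentioned above: each microstate contributes its fraction of recurrent entries, weighted by its probability. The function name `recurrence_rate` and the assumption that index `i` encodes the bit pattern of `i - 1` are illustrative only, not the package implementation:

```julia
# Hypothetical helper: recurrence rate from a microstate distribution.
# Assumes probs[i] is the probability of the microstate whose bits are
# the binary digits of i - 1, with nbits entries per microstate.
function recurrence_rate(probs::AbstractVector{<:Real}, nbits::Integer)
    length(probs) == 2^nbits ||
        throw(ArgumentError("expected 2^nbits probabilities"))
    # count_ones(i - 1) is the number of recurrent entries of microstate i.
    return sum(p * count_ones(i - 1) / nbits for (i, p) in enumerate(probs))
end

rr_1x1 = recurrence_rate([0.8, 0.2], 1)         # 1x1 case: probability of "1"
rr_2x2 = recurrence_rate(fill(1 / 16, 16), 4)   # uniform 2x2 distribution
```

Note that the function needs to know what each outcome means (its bit pattern), not just the probability values. This is the reason given earlier in the thread for treating recurrence rate as a complexity measure rather than an information measure.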

@Datseris
Member Author

Datseris commented Mar 24, 2026

Okay, I think there are two things we should do:

  1. Make a ComplexityEstimator for each one of the quantities (5 in total as far as I understand).
  2. Make a new bundle function that computes all quantities at the same time, using as little overlap of computation as possible, and returns a dictionary with the quantity name mapping to its value. This is similar to the rqa function. My understanding is that you can get the 3x3 or 4x4 (or nxn, n>2) microstates and from these you can compute all quantities? If that's the case this new function would have a single input n.

@gabriel-ferr
Member

gabriel-ferr commented Apr 14, 2026

Hello @Datseris 0/

I think I’ve finished the documentation, but there are two things that still need to be fixed:

  1. I had to remove the display of the README.md in index.md because it includes a link from the previous repo that now results in a 404. This causes an error when Documenter.jl checks links, so I removed it to allow the docs to build. We can add it back once everything is working properly 🙂

  2. There is a small issue with the types Disorder <: ComplexityEstimator and WindowedDisorder <: ComplexityEstimator. Both have a field labels that contains the list indicating which class each microstate belongs to. The problem is that when we call something like:

disorder = Disorder()

it prints everything to the terminal... for $N = 4$, that means ~140 classes with 65,536 microstates distributed among them.

I overwrote the default display with:

function Base.show(io::IO, ::MIME"text/plain", x::Disorder)
  print(io, "TODO: Disorder")
end

and I did the same for WindowedDisorder, but I’m not sure how to configure it to behave similarly to the default in ComplexityMeasures.jl. It is in src/rqa/disorder.jl.
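
For reference, a compact display that hides a large field can be sketched in Base Julia as below. `DemoDisorder` is a hypothetical stand-in for the real type, with a placeholder `labels` vector; this does not reproduce the pretty-printing style of ComplexityMeasures.jl:

```julia
# Hypothetical stand-in for a type that carries a large `labels` field.
struct DemoDisorder{N}
    labels::Vector{Int}
end

# Placeholder constructor: one label per N×N microstate (2^(N*N) in total).
DemoDisorder{N}() where {N} = DemoDisorder{N}(zeros(Int, 2^(N * N)))

# Print a one-line summary instead of dumping the whole `labels` vector.
function Base.show(io::IO, ::MIME"text/plain", x::DemoDisorder{N}) where {N}
    print(io, "DemoDisorder{", N, "} with ", length(x.labels),
        " microstate labels (hidden)")
end

s = sprint(show, MIME("text/plain"), DemoDisorder{2}())
```

Only the three-argument `show` is overloaded here, so the REPL display is shortened while `print` and `repr` keep their default behavior.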

That’s it for now... I’ll fix the tests next 🙂

Edit: about the function similar to rqa, I implemented a function rma. It only takes a threshold and the input data, using $N = 3$ to compute everything (since DET and LAM can only be estimated for this value).

@Datseris
Member Author

Fantastic Gabriel, I'll have a look!

Member Author

@Datseris left a comment

This is all great work Gabriel. I only have minor comments. Once these are addressed this should be good to go! I'll add a second example on recurrence microstates missing patterns later. Regarding your comment on the display, I'll have a look at it as well.

Comment thread src/RecurrenceMicrostatesAnalysis.jl Outdated
Comment thread docs/src/dev.md
positions of the `AbstractGPUVector` are accessed (`i` for `x`, and `j` for `y`), and `n` is the number
of dimensions of the system.
3. Add a docstring to your metric describing it.
4. Add your metric to `docs/src/api.md`.
Member Author

Suggested change
4. Add your metric to `docs/src/api.md`.
using RecurrenceMicrostatesAnalysis
using Distances: Euclidean

Comment thread docs/src/index.md
```@docs
RecurrenceMicrostatesAnalysis
```

Member Author

We have moved all packages to display their own README.md file as the starting information of the docs. E.g., have a look at the Attractors.jl docs to see how this is done (where the module Attractors is defined).

In the readme you should also have the citation request info.

Member

Yes, this is the first issue I mentioned. The README.md apparently contains a broken link, which causes an error when Documenter.jl checks links. I removed it temporarily to allow the docs to build.

I think I fixed it in this PR, but I’ll check it again and standardize the README following the other packages.

Comment thread docs/src/index.md
Finally, developers interested in contributing to RecurrenceMicrostatesAnalysis.jl are encouraged to read the [RecurrenceMicrostatesAnalysis.jl for Devs](@ref) section.

## Input data for RecurrenceMicrostatesAnalysis.jl
### Input data for RecurrenceMicrostatesAnalysis.jl
Member Author

The input and output sections should be in the API page.

Comment thread docs/src/tutorial.jl

# ## Crash-course into RMA

# Recurrence Plots (RPs) were introduced in 1987 by Eckmann et al.
Member Author

I think this sentence should be removed.

Comment thread docs/src/tutorial.jl

# ![Image of four RPs with their timeseries](assets/rps.png)

# A recurrence microstate is a local structure extracted from an RP. For a given microstate
Member Author

Suggested change
# A recurrence microstate is a local structure extracted from an RP. For a given microstate
# A recurrence microstate is a local structure extracted from a recurrence matrix. For a given microstate

Comment thread docs/src/tutorial.jl

# Notice that `X` is already a [`StateSpaceSet`](@ref). Because **RecurrenceMicrostatesAnalysis.jl**
# is part of **DynamicalSystems.jl**, this data type is the preferred input type.
# Other types are also possible as we described in [Input data for RecurrenceMicrostatesAnalysis.jl](@ref).
Member Author

I recommend to not refer to section headers by name, but rather use an id. Documenter.jl allows you to encapsulate a section header in a [text](@id id_name) specifier, and later refer to it as [hyperlinked](@ref id_name). This is more robust.
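
A sketch of what this could look like in a Literate.jl source file such as `tutorial.jl`, where markdown lives in `#` comments. The id `input_data` is a hypothetical name chosen for illustration, and placing both the header and the reference in the same file is an assumption made only to keep the example short:

```julia
# Header with an explicit id (hypothetical id name `input_data`):
# ## [Input data for RecurrenceMicrostatesAnalysis.jl](@id input_data)

# Reference to that id from anywhere in the docs, independent of the header text:
# Other types are also possible as described in the [input data section](@ref input_data).
```

With this form, the header text can be reworded later without breaking the cross-reference.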

Comment thread docs/src/gpu.md
@@ -0,0 +1,94 @@
# GPU
Member Author

I think this section should become a final subsection of the main tutorial. Computing over different architectures feels like a fairly central part of this package, so it should be in the main tutorial as one of the "key features". This also allows you to re-use the data generated in the main tutorial, reducing the overall length here.

Member

Ok, I’ll do it, but we cannot run the code in this section in Actions because it requires a GPU >.<

I tested it using Metal.jl on my PC, and I can also test it using CUDA on a machine in the physics department here.

Of course, I tested all of this code locally before writing it here 🙂 (I'll check again using CUDA later)

Member Author

It's okay if it cannot run in Actions; you can instead write it as a non-executed markdown Julia snippet, like # ```julia.

Comment thread docs/src/gpu.md
complexity(WindowedDisorder(W, N; metric = GPUEuclidean()), X_gpu)
```

!!! info "Performance"
Member Author

Haha, this is great but why not prove it? You should paste here a runnable code snippet that you run on your local machine and paste the output.

Comment thread docs/src/examples.jl
# In this section, we provide some examples where it is possible
# to apply RMA to analyze data.

# ## Classifying data with a multi-layer perceptron and RMA
Member Author

This example needs a bit more clarification.

  1. What are you classifying? What are you trying to achieve? You are trying to classify the data into what? Into e.g., chaotic and non chaotic regimes? Below you state "classify them based on a parameter used to generate them", but what does this mean? You have 5 parameters and you are hoping to generate 5 classes?

@Datseris
Member Author

@gabriel-ferr, here is the answer to your pretty printing question. You have to add a new method to the function ComplexityMeasures.relevant_fieldnames or hidefields, see:

https://github.com/JuliaDynamics/ComplexityMeasures.jl/blob/4d479aae63a98adb01d197438111a44a1349bb83/src/core/pretty_printing.jl#L49-L68


2 participants