Universal table transformer combining univariate transformations dispatched on schema  #49

@ablaom

It has been proposed on Slack that we provide a single table transformer that transforms individual columns according to user-specified univariate transformations. This sounds like a good idea, and it would also enforce some uniformity that is somewhat lacking in the current collection of table transformers.

  1. In the most general case I can imagine implementing, the univariate transformer that applies to a particular column is determined by a function of both the name and the scitype of the column (as encoded in the table schema). This has the disadvantage that the user must specify a function of two arguments, or interact through some other complicated interface.

  2. The alternative would be a compositional approach. Each tabular transformer applies only a single univariate transformer, to all columns matching the specified names and scitypes (or to all columns not matching them, via a Boolean ignore parameter). Columns not selected are left alone. Chaining such transformers would cover all conceivable use-cases. However, as we are currently locked into Tables.jl (whose tables are immutable in general), we get a lot more copying of data.
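
To make the two options concrete, here is a rough, language-neutral sketch in Python rather than Julia. It is not MLJ or Tables.jl code. A table is modelled as a plain dict of columns, and `scitype`, `transform_by_schema`, `make_transformer`, `center` and `to_float` are all hypothetical names invented for this illustration. The scitype labels are only a toy stand-in for the real scientific types.

```python
# Toy model: a table is a dict mapping column name -> list of values.
# All names below are hypothetical; none of this is the MLJ API.

def scitype(col):
    """Toy stand-in for a column's scientific type."""
    if all(isinstance(x, float) for x in col):
        return "Continuous"
    if all(isinstance(x, int) for x in col):
        return "Count"
    if all(isinstance(x, str) for x in col):
        return "Textual"
    return "Unknown"

# Option 1: one transformer, driven by a user function of (name, scitype).
def transform_by_schema(table, choose):
    """choose(name, scitype) returns a column -> column function, or None."""
    out = {}
    for name, col in table.items():
        f = choose(name, scitype(col))
        out[name] = col if f is None else f(col)
    return out

# Option 2: each transformer applies ONE univariate function to the
# columns selected by name or scitype; `ignore` inverts the selection.
def make_transformer(f, names=(), scitypes=(), ignore=False):
    def transform(table):
        out = {}  # a new table is built at every step of a chain
        for name, col in table.items():
            selected = name in names or scitype(col) in scitypes
            if ignore:
                selected = not selected
            out[name] = f(col) if selected else col
        return out
    return transform

# Two example univariate transformations acting on a whole column.
def center(col):
    mean = sum(col) / len(col)
    return [x - mean for x in col]

def to_float(col):
    return [float(x) for x in col]

table = {"age": [20, 30, 40], "height": [1.0, 2.0, 3.0], "name": ["a", "b", "c"]}

# Option 1: the user writes a single two-argument dispatch function.
def choose(name, st):
    if st == "Count":
        return to_float
    if name == "height":
        return center
    return None

result_1 = transform_by_schema(table, choose)

# Option 2: the same effect as a composition of two simple transformers.
step_a = make_transformer(to_float, scitypes=("Count",))
step_b = make_transformer(center, names=("height",))
result_2 = step_b(step_a(table))
```

In this sketch both routes give the same table. The trade-off raised above shows up in the shape of the code: option 1 asks the user for a two-argument function, while option 2 keeps each piece simple but builds a fresh table at every step of the chain.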

Thoughts anyone?
