SQL parsing/formatting overhead #47

Description

@davenquinn

We run sqlparse's parsing and formatting utilities on each statement. Profiling shows that this adds a lot of per-statement overhead.

macrostrat.database runs sqlparse.format + sqlparse.split on every run_query (line 136, canonicalize_query), and again in _render_query_text. sqlparse tokenizes the SQL character-by-character, which is where the 19M re.match calls in the profile come from.

sqlparse accounts for ~85% of the update time.

The parsing is only needed for multi-statement SQL, and faster parsers are available. We could investigate those and also create a 'fast path' for queries without semicolons.
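
A rough sketch of what the semicolon fast path might look like. The helper name `split_statements` and its `slow_split` parameter are hypothetical and not part of macrostrat.database; the idea is only that a query with no semicolon (other than an optional trailing one) is a single statement and can skip the parser entirely.

```python
from typing import Callable, List


def split_statements(sql: str, slow_split: Callable[[str], List[str]]) -> List[str]:
    """Split SQL into statements, avoiding the parser when possible.

    Hypothetical helper: `slow_split` stands in for the existing
    parser-based splitter (for example, sqlparse.split).
    """
    stripped = sql.strip()

    # Ignore one trailing semicolon when deciding whether this is
    # a single statement.
    body = stripped[:-1].rstrip() if stripped.endswith(";") else stripped

    if not body:
        # Empty input or a lone semicolon: nothing to run.
        return []

    if ";" not in body:
        # Fast path: no statement separator anywhere, so this is one
        # statement and no tokenizing is needed.
        return [stripped]

    # A semicolon is present. It may sit inside a string literal or a
    # comment, so defer to the real parser rather than guess.
    return slow_split(sql)
```

With sqlparse installed this would be called as `split_statements(sql, sqlparse.split)`. The check is deliberately conservative: any query containing a semicolon, even one inside a quoted string, still takes the slow path, so the fast path should never split a statement incorrectly. Whether the sqlparse.format call can be skipped in the same way depends on what canonicalize_query is used for, which this sketch does not address.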
