Skip to content

DSE Statistics & Fixes#1586

Open
agusaldasoro wants to merge 4 commits into
masterfrom
dse/memoization
Open

DSE Statistics & Fixes#1586
agusaldasoro wants to merge 4 commits into
masterfrom
dse/memoization

Conversation

@agusaldasoro

@agusaldasoro agusaldasoro commented Jun 16, 2026

Copy link
Copy Markdown
Collaborator

Summary

  • Adds memoization to the Z3 SMT solver to skip redundant Docker executions for identical SQL queries
  • Introduces Z3Result (SAT/UNSAT/ERROR) to replace Optional<Map<String, SMTLibValue>>, making solver outcomes explicit and preventing transient errors from being cached
  • Adds optional DSE statistics (collectDseStats flag) tracking SAT/UNSAT/error counts, timing, SMT-LIB size, and query uniqueness, written to CSV only when enabled
  • Adds measureDseCorrectness flag: after Z3 returns SAT, generated rows are evaluated with SqlHeuristicsCalculator to measure their distance to the original SQL predicate. Non-zero distances are logged as warnings and recorded in Statistics
  • Fixes composite primary key handling in SmtLibGenerator: previously every PK column had to differ individually across rows, which over-constrained the encoding. The generated SMT-LIB now enforces tuple uniqueness instead
  • Extends SmtLibGenerator to support DELETE and UPDATE statements in addition to SELECT
  • Fixes LongValue gene mapping by using LongGene instead of truncating values with .toInt()
  • Improves SQL parsing in JSqlVisitor and adds new SmtLibGeneratorTest coverage

Motivation

This PR improves both the correctness and efficiency of EvoMaster's DSE pipeline.

On the performance side, repeated SQL constraints previously caused redundant Docker/Z3 executions. Memoizing SAT/UNSAT results significantly reduces solver overhead while intentionally avoiding caching transient execution errors. The new optional DSE statistics provide detailed observability into solver behavior with virtually no overhead when disabled.

On the correctness side, generated database rows could satisfy the SMT-LIB encoding while failing the original SQL predicate due to encoding limitations, including composite primary key handling, unsupported SQL statement types, and integer truncation. This PR fixes the most common encoding issues and introduces an optional post-verification step that measures how closely generated rows satisfy the original SQL predicate, exposing any remaining mismatches as metrics.

New Stats

DSE Performance Statistics (collectDseStats)

CSV column Description
dseTotalQueries Total solver invocations (SAT + UNSAT + errors + parse failures)
dseUniqueQueries Distinct queries seen (by hash)
dseDuplicateQueries Invocations served from the memoization cache
dseSat Queries solved as SAT
dseUnsat Queries solved as UNSAT
dseErrors Queries that returned a Z3/Docker error
dseParseFailures Queries that failed SQL → SMT-LIB parsing
dseZ3TotalMs Total wall time spent in Z3 (ms)
dseSmtlibGenTotalMs Total wall time generating SMT-LIB (ms)
dseAvgSmtlibSizeBytes Average SMT-LIB file size (bytes)

DSE Correctness Statistics (measureDseCorrectness)

CSV column Description
dseCorrectnessChecks Number of SAT results that were post-verified
dseCorrectnessZeroDistance Checks where generated rows fully satisfied the original SQL predicate
dseCorrectnessNonZero Checks where generated rows had non-zero distance to the original SQL predicate
dseCorrectnessAvgDist Average correctness distance across successful evaluations
dseCorrectnessEvalFailures Checks where correctness evaluation failed

@agusaldasoro agusaldasoro changed the title Dse/memoization Add DSE Metrics + Memoization Jun 16, 2026
@agusaldasoro agusaldasoro requested a review from jgaleotti June 16, 2026 01:45
@agusaldasoro agusaldasoro changed the title Add DSE Metrics + Memoization DSE Memoization + Statistics Jun 18, 2026
@agusaldasoro agusaldasoro marked this pull request as ready for review June 18, 2026 11:52
@jgaleotti

Copy link
Copy Markdown
Collaborator

Is this ready for review? It seems to be failing...

@agusaldasoro

agusaldasoro commented Jun 24, 2026

Copy link
Copy Markdown
Collaborator Author

Is this ready for review? It seems to be failing...

@jgaleotti The changes are ready and working, but there seems to be an out of memory issue that I can not solve in the Core and CoreIT tests

@agusaldasoro agusaldasoro marked this pull request as draft June 25, 2026 00:37
@agusaldasoro agusaldasoro removed the request for review from jgaleotti June 25, 2026 00:37
@agusaldasoro agusaldasoro changed the title DSE Memoization + Statistics DSE Statistics & Fixes Jun 27, 2026
@agusaldasoro agusaldasoro marked this pull request as ready for review June 27, 2026 14:18
@agusaldasoro agusaldasoro requested a review from jgaleotti June 29, 2026 00:55
@agusaldasoro

Copy link
Copy Markdown
Collaborator Author

@jgaleotti Hey! It's ready for review now, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants