fix(wiki): eliminate duplicate-page bugs (path normalization, move_page, FTS triggers) #36
Merged
CreatePage/UpdatePage/DeletePage/GetPage/ListPages stored the raw
input string as the SQLite primary key while resolving the file on
disk via filepath.Join, which normalizes the path. Equivalent
denormalized spellings ("/projects/foo", "projects/foo",
"projects//foo", "projects/foo.md", etc.) thus produced two index
rows for the same on-disk file, surfacing as duplicates in search
and list results until the next process restart.
Add normalizePagePath() and apply it at every public entry point.
It rejects empty, ".", "/", and any path that escapes the wiki
root via "..".
Fixes #35
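The normalization described above could look roughly like the sketch below. The real normalizePagePath() is not shown in this PR text, so the body, the errInvalidPagePath sentinel, and the choice to strip a trailing ".md" are assumptions made for illustration; only the accepted and rejected spellings come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"path"
	"strings"
)

// errInvalidPagePath is a hypothetical sentinel error, not taken from the PR.
var errInvalidPagePath = errors.New("invalid page path")

// normalizePagePath maps equivalent spellings of a page path to one
// canonical key. Sketch only: the real function may differ in detail.
func normalizePagePath(raw string) (string, error) {
	// Drop surrounding whitespace and leading slashes so "/a/b" and "a/b" match.
	p := strings.TrimLeft(strings.TrimSpace(raw), "/")
	// Collapse "//", "." and inner ".." segments.
	p = path.Clean(p)
	// Assumption: "projects/foo.md" and "projects/foo" name the same page.
	p = strings.TrimSuffix(p, ".md")
	// Clean again in case trimming left "" or a trailing slash.
	p = path.Clean(p)
	// "." means the input was empty, ".", or "/"; a leading ".." escapes the root.
	if p == "." || p == ".." || strings.HasPrefix(p, "../") {
		return "", fmt.Errorf("%w: %q", errInvalidPagePath, raw)
	}
	return p, nil
}

func main() {
	inputs := []string{"/projects/foo", "projects/foo", "projects//foo", "projects/foo.md", "../secret", "/"}
	for _, raw := range inputs {
		p, err := normalizePagePath(raw)
		fmt.Printf("%q -> %q, err=%v\n", raw, p, err)
	}
}
```

Using the slash-only path package rather than path/filepath keeps the index key identical across operating systems; whether the project does the same is not stated in the source.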
Adds Wiki.MovePage and a corresponding move_page MCP tool. The
operation:
- Normalizes both source and destination paths.
- Refuses if normalized paths are equal, source is missing, or
destination already exists.
- Acquires per-page locks in sorted order to avoid deadlocks
against concurrent moves.
- Renames the file on disk, then refreshes the index: removes
the old row (which drops the page's outgoing-link rows) and
re-indexes under the new path.
- Leaves backlinks (other pages' [[wikilink]] text pointing to
the old name) untouched — those rows reflect what is actually
in the source markdown.
This replaces the common create_page + delete_page workaround
pattern that was leaving duplicate pages behind when agents
forgot the second step.
Refs #35
SQLite fires AFTER DELETE triggers on the implicit row removal inside
INSERT OR REPLACE only when PRAGMA recursive_triggers is ON. With the
default OFF, every indexPage / Reindex write left an orphan docid in
pages_fts pointing at the old revision. The Search() JOIN masked it,
but the FTS index grew unbounded and queries that bypass the JOIN
(probed directly: SELECT rowid FROM pages_fts WHERE pages_fts MATCH
'old-token') returned ghost hits.
- Set _pragma=recursive_triggers(1) in the DSN so it applies to
every pooled connection.
- Run an FTS rebuild once on Open() to purge orphans left behind by
previous versions.
Adds a regression test that probes pages_fts directly for the old
token.
Refs #35
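A rough sketch of the two pieces described above, the DSN pragma and the one-shot rebuild. The helper names buildDSN and purgeFTSOrphans are hypothetical, and the "file:" prefix is an assumption; the source does not name the SQLite driver, though the _pragma=name(value) form is the query-parameter style used by modernc.org/sqlite. No driver is imported here, so main only prints the strings.

```go
package main

import (
	"database/sql"
	"fmt"
)

// ftsRebuildSQL is the FTS special command that rebuilds pages_fts from
// its content source, discarding orphaned entries.
const ftsRebuildSQL = `INSERT INTO pages_fts(pages_fts) VALUES('rebuild')`

// buildDSN appends the pragma to the DSN so that every connection opened
// by the pool gets recursive_triggers turned on, not just the first one.
func buildDSN(dbPath string) string {
	return "file:" + dbPath + "?_pragma=recursive_triggers(1)"
}

// purgeFTSOrphans runs the rebuild once; it would be called from Open().
func purgeFTSOrphans(db *sql.DB) error {
	_, err := db.Exec(ftsRebuildSQL)
	return err
}

func main() {
	fmt.Println(buildDSN(".mind-map.db"))
	fmt.Println(ftsRebuildSQL)
}
```

Setting the pragma in the DSN rather than with a one-off PRAGMA statement matters because database/sql may open additional connections later, and a pragma executed on one connection does not carry over to the others.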
Fixes #35.
Three related fixes for the duplicate-page class of bugs.
1. Path normalization (2d640d4)
CreatePage/UpdatePage/DeletePage/GetPage/ListPages used the raw input string as the SQLite primary key while resolving the file via filepath.Join, which normalizes. Equivalent denormalized spellings (/projects/foo, projects/foo, projects//foo, projects/foo.md, etc.) produced two index rows for the same on-disk file, surfacing as duplicates in search and list until the next process restart cleaned them up via Reindex Phase 4.
Adds normalizePagePath() and applies it at every public entry point. Rejects empty, ., /, .., and any ../ traversal.
2. move_page tool (cd6cfeb)
Adds Wiki.MovePage and the corresponding move_page MCP tool, so agents can atomically rename/relocate a page instead of doing create_page at the new path + (often forgotten) delete_page at the old path, the pattern that was leaving duplicate files on disk.
- Normalizes both from and to.
- Refuses if normalized paths are equal, source is missing, or destination already exists.
- Acquires per-page locks in sorted order to avoid deadlocks against concurrent moves.
- os.Renames the file, then refreshes the index (drops the old row + outgoing-link rows; re-indexes under the new path).
- Backlinks (links.target = old_path) are intentionally not rewritten: those rows reflect [[wikilink]] text in other pages' markdown that still says the old name. Rewriting would make the index lie.
3. FTS recursive_triggers + rebuild (60f35ad)
SQLite fires AFTER DELETE triggers on the implicit row removal inside INSERT OR REPLACE only when PRAGMA recursive_triggers = ON. With the default OFF, every indexPage / Reindex write left an orphan docid in pages_fts pointing at the old revision. Search's JOIN masked it, but the FTS index grew unbounded, and a direct probe demonstrates the leak: SELECT rowid FROM pages_fts WHERE pages_fts MATCH 'old-token' returned ghost hits.
Sets _pragma=recursive_triggers(1) in the DSN (applies to every pooled connection) and runs an INSERT INTO pages_fts(pages_fts) VALUES('rebuild') once on Open() to purge orphans left behind by prior versions.
Backward compatibility
None of the three changes is breaking for existing .mind-map.db files. No schema changes. Denormalized rows are removed by the existing Reindex Phase 4 on first startup; FTS orphans are removed by the one-shot rebuild. Tables, columns, triggers, and FTS virtual table are unchanged.
Tests
Full go test ./... ✅, including new regression tests:
- TestNormalizePagePath
- TestDuplicateIndexRowsViaDenormalizedPaths
- TestRejectPathEscapingWikiRoot
- TestMovePage / …NormalizesPaths / …FailsWhenDestinationExists / …FailsWhenSourceMissing / …FailsWhenSamePath
- TestMovePage
- TestFTSDoesNotLeakOrphansOnUpdate (probes pages_fts directly to bypass the masking JOIN)