Skip to content

fix(database): cannot overwrite view bug#1464

Open
ChrisPdgn wants to merge 7 commits into
mainfrom
overwrite-view-bug
Open

fix(database): cannot overwrite view bug#1464
ChrisPdgn wants to merge 7 commits into
mainfrom
overwrite-view-bug

Conversation

@ChrisPdgn

@ChrisPdgn ChrisPdgn commented May 5, 2026

Copy link
Copy Markdown
Contributor

Problem

Concurrent calls to createView with the same viewName caused intermittent failures:

[ERROR] /database.DatabaseProvider/createView INTERNAL: Cannot overwrite … model once compiled.

This is a check-then-act race. Two or more callers both see this.views[viewName] as unset, then both try to register the same Mongoose model and create the same MongoDB view. The second caller fails because:

  • mongoose.model(name, schema) throws OverwriteModelError if the name is already registered.
  • createCollection({ viewOn }) fails with NamespaceExists if the view already exists in MongoDB.

Typical triggers:

  • Concurrent CreateResourceAccessList calls that resolve to the same hashed viewName.
  • Multiple database instances handling database:create:view bus events simultaneously.
  • recoverViewsFromDatabase running many createView calls in parallel.

The original code also had a second, subtler race in metadata insertion: a check-then-create pattern (findOne -> create) that could produce duplicate Views documents.

Solution

The fix introduces three complementary layers of concurrency protection and separates the tightly coupled creation logic into distinct steps.

1. In-process promise coalescing (pendingViewCreations)

A Map<string, Promise<void>> tracks in-flight work per viewName. When a second caller within the same Node.js process requests the same view while creation is already running, it awaits the existing promise instead of starting duplicate work. After the promise resolves, the caller re-checks local state before proceeding; it does not blindly assume the first caller succeeded.

2. Cross-instance distributed locking via Redlock

For multi-replica deployments (multiple database module instances sharing the same MongoDB and Redis), a Redlock-based distributed lock is acquired per view name.

Two methods were added to StateManager:

  • tryAcquireLock(resource, ttl): attempts to acquire the lock exactly once (retryCount: 0). Returns the lock on success, or null on contention. Only swallows true lock contention; Redis/infrastructure failures are re-thrown.
  • withLock(resource, ttl, fn): acquires the lock, executes fn, and releases in a finally block.

The isLockContention helper inspects Redlock's ExecutionError.attempts to distinguish true lock contention (all vote-against errors are ResourceLockedError) from infrastructure failures (for example, Redis unavailable), ensuring real errors surface instead of being treated as normal contention.

The runCreateViewLocked flow is:

  1. Try to be the writer: call tryAcquireLock. If acquired, run the creation logic and release.
  2. If contended: don't queue behind the lock. Instead, poll for readiness.
  3. If readiness times out: fall back to a blocking withLock call that waits for the lock and then creates the view.

3. Readiness polling for lock losers (waitForViewReadiness)

Instead of every lock loser serially queuing to acquire the same lock, they poll for the view's existence in parallel, checking every 50ms for up to 5 seconds:

  1. Is the view already in the local this.views cache with a matching query?
  2. Does the Views metadata document exist in MongoDB (read from primary) with the correct query?
  3. Does the physical MongoDB view namespace exist?
  4. Does the MongoDB view definition (viewOn + pipeline) match what we expect?

If all checks pass, the lock loser registers the Mongoose model locally via buildViewModel and returns without acquiring the lock. If something doesn't match, such as a query mismatch or a namespace that exists but is not a view, a descriptive GrpcError is thrown rather than silently producing corrupt state.

4. Separation of creation concerns

The old code mixed Mongoose model registration with MongoDB view creation in one path. The new code separates this into focused private methods:

  • buildViewModel: registers the Mongoose model in memory (this.views[viewName]) and clears stale models first.
  • createPhysicalMongoView: calls db.createCollection(viewName, { viewOn, pipeline }).
  • upsertViewMetadata: uses updateOne with upsert: true, replacing the old race-prone check-then-create.
  • createViewLocked: orchestrates the full locked creation path with state checks.

Summary of changes

  • Same-process callers now coalesce on one in-flight promise per viewName.
  • Cross-instance callers coordinate through a Redis-backed Redlock per view name.
  • Lock losers verify readiness in parallel before falling back to blocking lock acquisition.
  • MongoDB view creation, local Mongoose model registration, and metadata persistence are handled separately.
  • View metadata is upserted instead of inserted through a check-then-create pattern.
  • Redlock contention is distinguished from Redis/infrastructure failure.

Testing

Verified with two database instances deployed concurrently. The OverwriteModelError no longer occurs under the previously failing scenario.


What kind of change does this PR introduce?

  • Bugfix
  • Feature
  • Code style update
  • Refactor
  • Build-related changes
  • Other (please describe)

Does this PR introduce a breaking change?

  • Yes
  • No

The PR fulfills these requirements:

  • It's submitted to the main branch
  • When resolving a specific issue, it's referenced in the PR's description (e.g. fix #xxx, where "xxx" is the issue number)

@ChrisPdgn ChrisPdgn marked this pull request as draft May 6, 2026 13:04
ChrisPdgn and others added 5 commits June 26, 2026 16:13
Avoid serial Redlock handoff after the writer finishes by trying a short
lock acquire first, then polling durable view state before falling back
to the exclusive writer path.

Co-authored-by: Cursor <cursoragent@cursor.com>
@ChrisPdgn ChrisPdgn force-pushed the overwrite-view-bug branch from a5ea081 to b98d4ae Compare June 26, 2026 13:14
ChrisPdgn and others added 2 commits June 26, 2026 16:39
Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@ChrisPdgn ChrisPdgn marked this pull request as ready for review June 26, 2026 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants