Telemetry baseline: durable analytics outbox, session tracking, UI-framework marker #987

Open
johnml1135 wants to merge 2 commits into main from telemetry-migration-baseline

johnml1135 (Contributor) commented Jul 2, 2026

Summary

Establishes a FieldWorks-only telemetry baseline (openspec change: telemetry-migration-baseline) ahead of the planned Avalonia UI migration, so there's a "legacy WinForms" usage/crash/session baseline to compare against later.

  • AnalyticsOutbox (Src/Common/FwUtils/AnalyticsOutbox.cs): a durable local queue in front of DesktopAnalytics.Analytics.Track/ReportException. Today, an event generated while offline is silently dropped (DesktopAnalytics is fire-and-forget with no retry). The outbox persists each event as its own small JSON file, flushes on enqueue/startup/shutdown, and re-checks consent at flush time (not just enqueue time) so revoking consent stops even already-queued data from going out.
  • All 6 existing Analytics.Track/ReportException call sites are migrated to the new facade (FieldWorks.cs, AreaListener.cs, TrackingHelper.cs, ConcordanceContainer.cs, ConfigureInterlinDialog.cs, ObtainProjectMethod.cs).
  • Session baseline: session-start/session-end events with clean-vs-crashed classification, giving session duration and crash-free-session rate.
  • Usage enrichment: Analytics.SetApplicationProperty("UiFramework", "WinForms") set once at startup (forward-compatible — a future Avalonia surface sets a different value on the same property), plus duration (dwell time) added to the existing throttled SwitchToTool event.
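
The outbox behaviour described above (one JSON file per event, flush on enqueue, consent re-checked at flush time) can be sketched roughly as follows. This is a language-neutral illustration written in Python, not the C# code in AnalyticsOutbox.cs. The class name, method names, file-naming scheme, and the choice to leave queued files on disk when consent is revoked are all assumptions made for the sketch.

```python
import json
import os
import time


class Outbox:
    """File-per-event queue in front of a fire-and-forget sender (hypothetical API)."""

    def __init__(self, directory, consent_given, send):
        self.directory = directory
        self.consent_given = consent_given  # callable returning bool
        self.send = send                    # callable(event); raises OSError when offline
        self._seq = 0
        os.makedirs(directory, exist_ok=True)

    def enqueue(self, name, properties=None):
        if not self.consent_given():
            return  # nothing is persisted without consent
        self._seq += 1
        # Zero-padded timestamp plus sequence number: lexical order matches FIFO order.
        stem = f"{time.time_ns():020d}-{self._seq:06d}"
        final_path = os.path.join(self.directory, stem + ".json")
        tmp_path = final_path + ".tmp"
        with open(tmp_path, "w", encoding="utf-8") as f:
            json.dump({"name": name, "properties": properties or {}}, f)
        os.replace(tmp_path, final_path)  # publish only a fully written file
        self.flush()

    def flush(self):
        """Deliver queued events oldest-first and return how many were sent."""
        sent = 0
        for filename in sorted(os.listdir(self.directory)):
            if not filename.endswith(".json"):
                continue
            if not self.consent_given():
                break  # consent revoked: queued events stay unsent (assumed policy)
            path = os.path.join(self.directory, filename)
            with open(path, encoding="utf-8") as f:
                event = json.load(f)
            try:
                self.send(event)
            except OSError:
                break  # offline: keep this file and all later ones for the next flush
            os.remove(path)
            sent += 1
        return sent
```

In this sketch, startup and shutdown flushing would simply be extra calls to `flush()`. Because consent is checked inside the loop rather than only in `enqueue`, an event written while consent was granted is still held back if consent has been revoked by the time it would be sent.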

Bug found and fixed during implementation

While writing the concurrency test for the outbox's file-claim logic, I found that on .NET Framework System.IO.File.Move does not reliably report failure to the "losing" thread when two callers race an identical rename: both calls can return without throwing, even though the OS performs exactly one physical rename. I verified this with an isolated repro outside this codebase (~99% reproducible across hundreds of trials). On-disk state confirmed that only one physical rename ever occurs, so the underlying filesystem operation is correct; what is unreliable is the wrapper surfacing an exception to the loser. This invalidated the original "atomic rename = exclusive claim" design. The fix follows the rename with a FileShare.None exclusive open, which serves as the real, OS-enforced single-owner check (documented as design.md D14). The bug was caught by AnalyticsOutboxTests.Flush_ConcurrentFlushCalls_DeliverEachEventExactlyOnce, which failed intermittently before the fix (11-20 deliveries instead of 10) and now passes deterministically.
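
The shape of that fix, "treat the rename result as a hint and let an OS-enforced exclusive operation decide ownership", can be illustrated with a small sketch. It is written in Python for illustration only and is not the C# code in this PR. Python has no portable equivalent of a FileShare.None open, so the sketch swaps in exclusive creation of a marker file (`O_CREAT | O_EXCL`) as the single-owner check. The function name, the `.owner` suffix, and the paths are hypothetical.

```python
import os
import threading


def try_claim(pending_path, claimed_path):
    """Return True for exactly one caller, which then owns the claimed file."""
    try:
        os.rename(pending_path, claimed_path)
    except OSError:
        pass  # the rename outcome is only a hint, never proof of ownership
    if not os.path.exists(claimed_path):
        return False  # nobody renamed anything, so there is nothing to own
    try:
        # O_CREAT | O_EXCL fails with FileExistsError when the marker already
        # exists, so the operating system admits exactly one caller.
        fd = os.open(claimed_path + ".owner", os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    os.close(fd)
    return True
```

The design point is that correctness no longer depends on every racing caller seeing an exception from the rename: even if several callers believe their rename succeeded, only one of them gets past the exclusive step.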

Test plan

  • .\build.ps1 — full managed build succeeds, no native rebuild triggered.
  • .\test.ps1 on FwUtilsTests — 390/390 pass, including 15 new AnalyticsOutboxTests (consent gating, FIFO delivery, cap eviction by count/age, claim-race exactly-once delivery, orphan recovery with staleness gating, delivery-failure rollback).
  • .\test.ps1 on LexTextDllTests, FieldWorksTests, ITextDllTests — zero regressions from the call-site migration.
  • Gap: no automated test for session-start/end tracking (FieldWorks.cs) or SwitchToTool dwell-time computation (AreaListener.cs) — both are blocked on cross-assembly access to AnalyticsOutbox's test seams (currently InternalsVisibleTo only covers FwUtilsTests). Documented as a follow-up in tasks.md §3.6/§4.6.
  • Gap: no automated way to verify the UiFramework application property actually attaches to outgoing events — that's internal to the DesktopAnalytics/Mixpanel client. tasks.md §4.2.
  • Manual, not yet performed: end-to-end offline queueing (disconnect network, use FieldWorks, confirm files accumulate under %LocalAppData%\SIL\FieldWorks\Analytics\Outbox\, reconnect, relaunch, confirm the directory empties). tasks.md §5.3.
  • Manual, not yet performed: confirm the Tools > Options > Privacy checkbox (OkToPingBasicUsageData) still gates all telemetry, including already-queued-but-unflushed events. tasks.md §5.4.
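
For the session-tracking gap listed above, the logic a future unit test would need to exercise can be sketched as follows. The PR does not say how FieldWorks.cs distinguishes clean from crashed sessions. The marker-file approach here is one common technique and is an assumption, as are the function names and the marker file name. Python is used for illustration only.

```python
import json
import os
import time

MARKER_NAME = "session-in-progress.json"  # hypothetical file name


def start_session(state_dir, now=None):
    """Write the session marker; return True if the previous session crashed."""
    os.makedirs(state_dir, exist_ok=True)
    marker = os.path.join(state_dir, MARKER_NAME)
    # A leftover marker means the previous session never reached end_session.
    previous_crashed = os.path.exists(marker)
    with open(marker, "w", encoding="utf-8") as f:
        json.dump({"started": time.time() if now is None else now}, f)
    return previous_crashed


def end_session(state_dir, now=None):
    """Clean shutdown: remove the marker and return the duration in seconds."""
    marker = os.path.join(state_dir, MARKER_NAME)
    with open(marker, encoding="utf-8") as f:
        started = json.load(f)["started"]
    os.remove(marker)
    return (time.time() if now is None else now) - started


def crash_free_rate(clean_sessions, crashed_sessions):
    """Fraction of sessions that ended cleanly, or None when there are no sessions."""
    total = clean_sessions + crashed_sessions
    return clean_sessions / total if total else None
```

Passing `now` explicitly keeps the sketch testable without a real clock, which is the kind of test seam the gap above says is not yet reachable from outside FwUtilsTests.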

🤖 Generated with Claude Code


This change is Reviewable

johnml1135 and others added 2 commits July 2, 2026 11:05
…on baseline, and forward-compatible UI-framework marker

Grounds a prior strategic-review recommendation (accrue a Legacy-UI telemetry
baseline ahead of the Avalonia rollout) against actual code on origin/main via
grill-with-docs. Adds the openspec change (proposal/design/specs/tasks) plus
grounded Analytics/Telemetry terminology and ADR decisions in CONTEXT.md.

No production code yet; this is planning artifacts only.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>
…sion baseline, usage enrichment

Adds AnalyticsOutbox, a durable local queue in front of DesktopAnalytics
Track/ReportException so events survive being generated while offline
instead of being silently dropped. Migrates all six existing call sites
to the new facade, adds session-start/end tracking with clean-vs-crashed
classification, and sets a UiFramework=WinForms application property as
forward-compatible scaffolding for a future Avalonia UI split.

Found and fixed during implementation: System.IO.File.Move does not
reliably report failure to the losing thread when two callers race an
identical rename on .NET Framework - both can return without throwing
even though the OS performs exactly one physical rename (verified with
an isolated repro, ~99% reproducible). The outbox's claim mechanism now
follows the rename with a FileShare.None exclusive open as the real
single-owner check (design.md D14). Caught by the concurrent-flush unit
test, which failed intermittently before this fix.

390 FwUtilsTests pass (15 new AnalyticsOutboxTests), plus zero
regressions in LexTextDllTests, FieldWorksTests, and ITextDllTests.
tasks.md documents three remaining test gaps (session-baseline and
dwell-time unit tests blocked on cross-assembly test-seam access) and
two manual verification steps (end-to-end offline queueing, privacy
toggle behavior) not yet performed.

Co-Authored-By: Claude Sonnet 5 <noreply@anthropic.com>

github-actions Bot commented Jul 2, 2026


⚠️ Commit Message Format Issues ⚠️
commit d0c2288d2e:
1: T1 Title exceeds max length (100>72): "Implement telemetry-migration-baseline: durable analytics outbox, session baseline, usage enrichment"

commit f92d4d0398:
1: T1 Title exceeds max length (124>72): "Propose telemetry-migration-baseline: durable analytics outbox, session baseline, and forward-compatible UI-framework marker"


github-actions Bot commented Jul 2, 2026


NUnit Tests

    1 files ±0        1 suites ±0    10m 25s ⏱️ -14s
4 314 tests +15   4 241 ✅ +15   73 💤 ±0   0 ❌ ±0
4 323 runs  +15   4 250 ✅ +15   73 💤 ±0   0 ❌ ±0

Results for commit d0c2288. ± Comparison against base commit 323a022.
