GRA-1144: add opt-in SDK WAU telemetry by Gradata · Pull Request #251 · Gradata/gradata

Gradata · 2026-06-03T10:30:35Z

Implements the SDK/CLI side of GRA-1144 WAU telemetry.

- sends anonymous opt-in wau_ping on agent session start
- keeps telemetry off by default; existing config prompt/kill-switch still applies
- adds gradata telemetry wau readback command
- documents opt-in and privacy disclosure in README

Verification:

python3 -m pytest tests/test_telemetry.py -q → 39 passed

coderabbitai · 2026-06-03T10:30:46Z

📝 Walkthrough

WAU Telemetry Implementation: Adds opt-in weekly active user (wau_ping) telemetry that sends anonymous usage data when an agent session starts; disabled by default
New CLI Command: gradata telemetry wau provides a readback command to fetch and display live WAU metrics
Session Hook Integration: Session boot hook automatically triggers send_session_ping() to record WAU events at session start with graceful error handling (debug logs only)
New Public API Methods:
- send_session_ping(*, blocking: bool = False) – sends opt-in WAU ping (best-effort, background thread by default)
- fetch_wau(timeout: float = 3.0) – fetches WAU aggregate data from telemetry endpoint
Telemetry Event Types: Extends telemetry system with HEARTBEAT_EVENTS constant and TelemetryEventName type covering activation and heartbeat events
Enhanced HTTP Routing: _post() method now accepts optional endpoint override for routing different event types to appropriate endpoints
Privacy Documentation: README updated with opt-in behavior, data categories sent, explicit privacy disclosures (no code, file paths, prompts, emails, names, stack traces, or raw IPs), and how to disable via GRADATA_TELEMETRY=0
Event Validation Update: send_event() validation extended from activation events only to all telemetry events (including wau_ping)
Test Coverage: 39 tests passing, including new tests for session ping posting and WAU fetch error handling
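
The opt-in, best-effort behavior described for send_session_ping() can be sketched roughly as follows. This is a minimal illustration and not the SDK's code: `send_session_ping_sketch`, the injected `post` callable, and the `enabled` flag are hypothetical stand-ins for whatever the real module uses internally.

```python
import logging
import threading
from typing import Callable, Optional

logger = logging.getLogger(__name__)


def send_session_ping_sketch(
    post: Callable[[dict], bool],
    enabled: bool,
    *,
    blocking: bool = False,
) -> Optional[threading.Thread]:
    """Hypothetical stand-in for send_session_ping(); not the SDK's code."""
    if not enabled:
        # Opt-in: when telemetry is disabled, nothing is sent at all.
        return None

    def _run() -> None:
        try:
            post({"event": "wau_ping"})
        except Exception:
            # Best-effort: log at debug level and never break the session.
            logger.debug("wau_ping failed", exc_info=True)

    if blocking:
        _run()
        return None

    # Default path: fire-and-forget on a daemon thread.
    thread = threading.Thread(target=_run, name="wau-ping", daemon=True)
    thread.start()
    return thread
```

Passing `blocking=True` keeps the call synchronous, which is what makes the ping easy to assert on in tests.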

Walkthrough

This PR introduces session heartbeat telemetry alongside existing activation events. It defines heartbeat event types, extends HTTP posting to support endpoint overrides, implements session ping and WAU fetch utilities, integrates session pings at startup, adds a CLI command to view WAU metrics, and documents the telemetry privacy model.
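
As a rough illustration of the "fallback error dict" behavior attributed to fetch_wau(), the sketch below returns a dict with an `error` key instead of raising when the endpoint cannot be reached. The function name, the injectable `opener` parameter, and the exact error strings are assumptions; only the `wau` and `error` keys and the 3.0 second default timeout come from the PR text.

```python
import json
import urllib.request
from typing import Any, Callable


def fetch_wau_sketch(
    url: str,
    timeout: float = 3.0,
    opener: Callable[..., Any] = urllib.request.urlopen,
) -> dict:
    """Hypothetical stand-in for fetch_wau(); never raises on network errors."""
    try:
        with opener(url, timeout=timeout) as resp:
            data = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError) as exc:
        # URLError and timeouts are OSError subclasses; bad JSON or bad
        # encoding raises ValueError subclasses.
        return {"wau": 0, "error": f"unavailable: {exc}"}
    if not isinstance(data, dict):
        return {"wau": 0, "error": "unexpected response shape"}
    return data
```

Returning a dict on every path lets a caller such as the CLI print a status line without its own exception handling.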

Changes

Session Heartbeat Telemetry

Layer / File(s)	Summary
Telemetry event types and core utilities `Gradata/src/gradata/_telemetry.py`	Defines `HEARTBEAT_EVENTS` (`wau_ping`) and combines activation + heartbeat into `TELEMETRY_EVENTS`; adds `TelemetryEventName` type; extends `_post` to accept optional endpoint override; updates `send_event` validation for combined event set; implements `_ping_endpoint`, `send_session_ping`, and `fetch_wau` (with fallback error dict on failures).
Session ping initialization `Gradata/src/gradata/hooks/session_boot.py`	Calls `send_session_ping()` at session startup, wrapped in try/except to log debug on failure without blocking session continuation.
CLI telemetry visibility command `Gradata/src/gradata/cli.py`	Adds `cmd_telemetry(args)` handler to fetch and display WAU data; supports `--json` or formatted output; integrates `telemetry` subcommand with `wau` subcommand into argparse and command dispatch.
Telemetry tests `Gradata/tests/test_telemetry.py`	Validates `send_session_ping()` posts `wau_ping` when enabled and is a no-op when disabled; verifies `_ping_endpoint()` derives correct URL; confirms `fetch_wau()` returns fallback error dict on unreachable endpoints.
Privacy and telemetry documentation `Gradata/README.md`	Documents opt-in telemetry (default-off), consent flow, stored config path, allowed fields (event name, hashed user_id, UTC timestamp, SDK version), explicit exclusions (code, file paths, content, prompts, emails, names, stack traces, env vars, IPs), disable via `GRADATA_TELEMETRY=0`, and example `wau_ping` command.
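
The tests in this PR expect _ping_endpoint() to map `https://api.example.com/telemetry/event` to `https://api.example.com/telemetry/ping`. One way such a derivation could work is sketched below. The rule used here (swap the last path segment for `ping`) is an assumption that happens to satisfy that example; the actual implementation is not shown in this review, and `derive_ping_endpoint` is a hypothetical name.

```python
from urllib.parse import urlsplit, urlunsplit


def derive_ping_endpoint(event_endpoint: str) -> str:
    """Replace the final path segment of the event URL with 'ping' (assumed rule)."""
    parts = urlsplit(event_endpoint)
    # Drop any trailing slash, then split off the last path segment.
    head, _, _last = parts.path.rstrip("/").rpartition("/")
    return urlunsplit(parts._replace(path=f"{head}/ping"))
```

Working on the parsed path rather than the raw string keeps the scheme, host, and any query string untouched.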

🎯 3 (Moderate) | ⏱️ ~20 minutes

feature, docs

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Title check	✅ Passed	The pull request title 'GRA-1144: add opt-in SDK WAU telemetry' clearly and concisely summarizes the main change: adding WAU telemetry functionality to the SDK with opt-in behavior.
Description check	✅ Passed	The pull request description is directly related to the changeset, detailing the implementation of WAU telemetry including session pings, opt-in/default-off behavior, CLI command, README documentation, and test verification.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 OpenGrep (1.22.0)

OpenGrep fatal error (exit code 2):
┌──────────────┐
│ Opengrep CLI │
└──────────────┘

✔ Opengrep OSS
✔ Basic security coverage for first-party code vulnerabilities.

 Loading rules from local config...
[00.22][ERROR]: Error: exception Glob.Lexer.Syntax_error("malformed glob pattern: missing ']'")
Raised at Glob__Lexer.syntax_error in file "libs/glob/Lexer.mll", line 8, characters 2-26
Called from Glob__Lexer.__ocaml_lex_token_rec in file "libs/glob/Lexer.mll", line 29, characters 26-53
Cal

coderabbitai

Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Gradata/README.md`:
- Around line 152-156: Update the README paragraph that begins "It never sends
code, file paths, lesson/correction text, prompts, emails, names, stack traces,
environment variables, or raw IP addresses." to remove the absolute claim about
raw IPs and clarify that while the SDK payload does not include IP fields, the
telemetry service and network infrastructure can still observe the source IP at
transport time; keep the note about GRADATA_TELEMETRY=0 and the wau_ping
behavior intact. Reference the exact sentence starting with "It never sends ..."
when locating the text to edit and replace that clause about raw IPs with the
revised, accurate wording about transport-level visibility.

In `@Gradata/src/gradata/cli.py`:
- Around line 135-150: Add deterministic unit tests that exercise the CLI wiring
for the new telemetry command: create tests (e.g. tests/test_cli_telemetry.py)
that invoke gradata.cli.cmd_telemetry via argparse-like invocation or by calling
the function with a fake args namespace to assert behavior of cmd_telemetry;
include cases for success JSON output (args.json=True) and human-readable output
(args.json=False) using a mocked gradata._telemetry.fetch_wau to return
predictable dicts (including a dict with an "error" key to test the error/status
path), and verify printed stdout for WAU, Week start, and Status messages as
well as JSON formatting. Ensure tests avoid nondeterministic calls and
patch/monkeypatch gradata._telemetry.fetch_wau so CI is deterministic.

In `@Gradata/tests/test_telemetry.py`:
- Around line 139-146: Update the test_session_ping_posts_wau_to_ping_endpoint
test to assert that send_session_ping posts to the derived ping URL by checking
the first arg passed into the mocked _post call equals
_telemetry._ping_endpoint(); specifically, after calling
_telemetry.send_session_ping(blocking=True) and before asserting event payload,
add an assertion that post.call_args[0][0]["url"] (or the correct key used for
the request URL in the _post payload) == _telemetry._ping_endpoint() so the test
verifies send_session_ping() uses _telemetry._ping_endpoint() rather than the
generic event endpoint.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 78023ecc-3aaa-47aa-a55a-81d5b227b881

📥 Commits

Reviewing files that changed from the base of the PR and between 0f1513c and af059d1.

📒 Files selected for processing (5)

Gradata/README.md
Gradata/src/gradata/_telemetry.py
Gradata/src/gradata/cli.py
Gradata/src/gradata/hooks/session_boot.py
Gradata/tests/test_telemetry.py

📜 Review details

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)

GitHub Check: pytest (py3.12)
GitHub Check: pytest (py3.11)
GitHub Check: pytest ubuntu-latest / py3.12
GitHub Check: pytest windows-latest / py3.12
GitHub Check: pytest macos-latest / py3.11
GitHub Check: pytest macos-latest / py3.12
GitHub Check: pytest windows-latest / py3.11
GitHub Check: pytest ubuntu-latest / py3.11

🧰 Additional context used

📓 Path-based instructions (2)

Gradata/src/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/src/**/*.py: Prefer sentence-transformers for local embeddings, google-genai for Gemini embeddings, cryptography for AES-GCM encrypted system.db, bm25s for BM25 rule ranking, and mem0ai for external memory adapters — guard all optional dependency imports with try / except ImportError at the call site, never at module level
Maintain strict layering: Layer 0 (Primitives: _types.py, _db.py, _events.py, _paths.py, _file_lock.py; Patterns: contrib/patterns/) must never import from Layer 1 (Enhancements: enhancements/, rules/) or Layer 2 (Public API: brain.py, cli.py, daemon.py, mcp_server.py)
Never use bare except: pass — use typed exceptions or at minimum logger.warning(...) with exc_info=True to avoid silent failure in a memory product
Never import from out-of-scope sibling directories ../Sprites/ or ../Hausgem/ within gradata/* code — that is a layering bug
Never leak private-sibling paths into public docs/code — no references to ../Sprites/, ../Hausgem/, email addresses, OneDrive paths, or Sprites-specific examples from inside gradata/*
Use atomic-write helper when writing JSON files to prevent corruption from mid-write crashes
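
For readers unfamiliar with the atomic-write guideline above, a common pattern is to write to a temporary file in the same directory and then rename it over the target. The sketch below is illustrative only: `atomic_write_json` is a hypothetical name, and the repository's own helper may differ.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Union


def atomic_write_json(path: Union[str, Path], data: Any) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial."""
    path = Path(path)
    # The temp file must live in the same directory so the rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as fh:
            json.dump(data, fh)
            fh.flush()
            os.fsync(fh.fileno())
        # os.replace overwrites the destination in a single step.
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file and leave the original untouched.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```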

Files:

Gradata/src/gradata/hooks/session_boot.py
Gradata/src/gradata/cli.py
Gradata/src/gradata/_telemetry.py

Gradata/tests/**/*.py

📄 CodeRabbit inference engine (Gradata/AGENTS.md)

Gradata/tests/**/*.py: Set BRAIN_DIR environment variable via tmp_path in conftest.py for test isolation — ensure _paths.py module cache refreshes when calling Brain.init() directly inside tests
Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic); mark integration tests with @pytest.mark.integration and skip them by default (they hit real LLM APIs)

Files:

Gradata/tests/test_telemetry.py

🧠 Learnings (2)

📚 Learning: 2026-05-01T15:50:32.772Z

Learnt from: CR
Repo: Gradata/gradata PR: 0
File: Gradata/AGENTS.md:0-0
Timestamp: 2026-05-01T15:50:32.772Z
Learning: Use Python 3.11+ — distribute to PyPI as `gradata` under Apache-2.0 license with architecture: Local-first SQLite + JSONL event log, optional cloud sync

Applied to files:

Gradata/README.md

📚 Learning: 2026-05-01T15:50:32.772Z

Learnt from: CR
Repo: Gradata/gradata PR: 0
File: Gradata/AGENTS.md:0-0
Timestamp: 2026-05-01T15:50:32.772Z
Learning: Use `from gradata import Brain` as the public entry point — `brain.correct()` is THE entry point for the headline product promise

Applied to files:

Gradata/README.md

coderabbitai · 2026-06-03T10:35:56Z

+It never sends code, file paths, lesson/correction text, prompts, emails, names,
+stack traces, environment variables, or raw IP addresses. Set
+`GRADATA_TELEMETRY=0` to disable telemetry for any session, even if you opted in.
+For dogfood metrics, `wau_ping` fires once on each agent session start and powers
+weekly active user reporting:


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Do not claim the SDK never sends raw IP addresses.

The payload excludes IP fields, but the telemetry service and normal network infrastructure still see the request source IP at transport time. As written, this privacy disclosure is inaccurate.
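
The distinction can be made concrete with a sketch of a payload builder limited to the documented categories (event name, hashed user_id, UTC timestamp, SDK version). The field names, the SHA-256 choice, and `build_payload` itself are assumptions for illustration; the README only names the categories. No field here carries an IP address, yet the server still sees the source address of the connection that delivers the payload.

```python
import hashlib
from datetime import datetime, timezone

# Field names are assumptions; the README lists only the categories.
ALLOWED_FIELDS = frozenset({"event", "user_id", "ts", "sdk_version"})


def build_payload(event: str, raw_user_id: str, sdk_version: str) -> dict:
    """Build a telemetry payload restricted to the allow-listed fields."""
    payload = {
        "event": event,
        # Only a one-way hash of the identifier is included.
        "user_id": hashlib.sha256(raw_user_id.encode("utf-8")).hexdigest(),
        "ts": datetime.now(timezone.utc).isoformat(),
        "sdk_version": sdk_version,
    }
    unexpected = set(payload) - ALLOWED_FIELDS
    if unexpected:
        raise ValueError(f"unexpected telemetry fields: {sorted(unexpected)}")
    return payload
```

An allow-list check like this guards the payload contents only, which is why the disclosure should speak about payload fields rather than about what the network can observe.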

Suggested rewording

-It never sends code, file paths, lesson/correction text, prompts, emails, names,
-stack traces, environment variables, or raw IP addresses. Set
+The SDK payload never includes code, file paths, lesson/correction text, prompts,
+emails, names, stack traces, environment variables, or IP address fields. Like
+any HTTPS request, the telemetry service and standard network infrastructure may
+still process the source IP at transport time. Set
 `GRADATA_TELEMETRY=0` to disable telemetry for any session, even if you opted in.

coderabbitai · 2026-06-03T10:35:56Z

+def cmd_telemetry(args):
+    """Telemetry visibility commands."""
+    from gradata import _telemetry
+
+    if args.telemetry_cmd == "wau":
+        data = _telemetry.fetch_wau()
+        if args.json:
+            print(json.dumps(data, indent=2, sort_keys=True))
+            return
+        print(f"WAU: {data.get('wau', 0)}")
+        if data.get("week_start"):
+            print(f"Week start: {data['week_start']}")
+        if data.get("error"):
+            print(f"Status: {data['error']}")
+        return
+    raise SystemExit("unknown telemetry command")


🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win

Add deterministic CLI coverage for gradata telemetry wau.

This introduces a new user-facing command, but the provided tests only cover _telemetry.fetch_wau(). Please add CLI-level tests for argparse wiring, --json, and the human-readable error/success output paths.

As per coding guidelines, "Add unit tests in tests/test_*.py for every CI push without LLM calls (deterministic)".
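
A deterministic test along these lines might look like the sketch below. To stay self-contained it uses `cmd_telemetry_sketch`, a copy of the handler from the diff above that takes `fetch_wau` as a parameter; that injection and the `run_cli` helper are assumptions made for illustration. In the real suite, gradata._telemetry.fetch_wau would be patched instead.

```python
import argparse
import contextlib
import io
import json


def cmd_telemetry_sketch(args, fetch_wau):
    """Mirror of the cmd_telemetry diff, with fetch_wau injected for testing."""
    if args.telemetry_cmd == "wau":
        data = fetch_wau()
        if args.json:
            print(json.dumps(data, indent=2, sort_keys=True))
            return
        print(f"WAU: {data.get('wau', 0)}")
        if data.get("week_start"):
            print(f"Week start: {data['week_start']}")
        if data.get("error"):
            print(f"Status: {data['error']}")
        return
    raise SystemExit("unknown telemetry command")


def run_cli(data: dict, *, as_json: bool) -> str:
    """Run the handler with a fake args namespace and capture stdout."""
    args = argparse.Namespace(telemetry_cmd="wau", json=as_json)
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        cmd_telemetry_sketch(args, lambda: data)
    return buf.getvalue()
```

Because the fetch function returns a fixed dict, the captured output is identical on every run and no network access is needed.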

coderabbitai · 2026-06-03T10:35:56Z

+    def test_session_ping_posts_wau_to_ping_endpoint(self, monkeypatch):
+        _telemetry.set_enabled(True)
+        monkeypatch.setenv(_telemetry.ENV_ENDPOINT, "https://api.example.com/telemetry/event")
+        with patch.object(_telemetry, "_post", return_value=True) as post:
+            _telemetry.send_session_ping(blocking=True)
+            post.assert_called_once()
+            assert post.call_args[0][0]["event"] == "wau_ping"
+        assert _telemetry._ping_endpoint() == "https://api.example.com/telemetry/ping"


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Assert the derived endpoint on the _post call.

This currently proves _ping_endpoint() computes the right URL, but not that send_session_ping() actually uses it. A regression that posts wau_ping to /telemetry/event would still pass here.

Suggested assertion

         with patch.object(_telemetry, "_post", return_value=True) as post:
             _telemetry.send_session_ping(blocking=True)
             post.assert_called_once()
             assert post.call_args[0][0]["event"] == "wau_ping"
+            assert post.call_args.kwargs["endpoint"] == "https://api.example.com/telemetry/ping"
         assert _telemetry._ping_endpoint() == "https://api.example.com/telemetry/ping"
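
To show why the extra assertion matters, the sketch below runs the same checks against a sender that uses the ping URL and one that regresses to the event URL. It assumes the URL reaches `_post` as an `endpoint` keyword argument, as the suggested assertion implies; `send_ping` and `ping_test_passes` are hypothetical helpers, not code from the PR.

```python
from unittest.mock import Mock

PING_URL = "https://api.example.com/telemetry/ping"
EVENT_URL = "https://api.example.com/telemetry/event"


def send_ping(post, endpoint: str) -> None:
    """Hypothetical sender: passes the target URL as an `endpoint` keyword."""
    post({"event": "wau_ping"}, endpoint=endpoint)


def ping_test_passes(endpoint_used: str) -> tuple:
    """Return (payload check, endpoint check) for a sender using endpoint_used."""
    post = Mock(return_value=True)
    send_ping(post, endpoint_used)
    post.assert_called_once()
    payload_ok = post.call_args[0][0]["event"] == "wau_ping"
    endpoint_ok = post.call_args.kwargs["endpoint"] == PING_URL
    return payload_ok, endpoint_ok
```

With only the payload check, both senders pass; the endpoint check is the one that fails when the ping is routed to the event URL.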

feat: add opt-in SDK WAU telemetry

af059d1

greptile-apps Bot reviewed Jun 3, 2026

coderabbitai Bot added docs feature labels Jun 3, 2026

coderabbitai Bot requested changes Jun 3, 2026

Gradata wants to merge 1 commit into main from gra-1144-wau-telemetry
