docs: add GRA-374 experiment artifact #255

Open
Gradata wants to merge 1 commit into main from docs/gra-374-experiment-artifact
Conversation

@Gradata
Owner

@Gradata Gradata commented Jun 4, 2026

Summary

  • Adds a durable research artifact for GRA-374 multi_cli_install_success_rate
  • Records the baseline, one-measure definition, 7-day decision rule, evaluation result, and GRA-1163 follow-up

Verification

  • git diff --cached --check before commit
  • doc-only change; no runtime tests required

Paperclip: GRA-374

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Review Change Stack

📝 Walkthrough

Documentation: GRA-374 Multi-CLI Installer Experiment Artifact

  • Added durable research artifact documenting the GRA-374 experiment evaluating multi-CLI installer success rates across Claude Code, Codex, Cursor, Hermes, and OpenCode
  • Evaluation window: 2026-05-12 to 2026-05-19
  • Success metric: ≥80% (4 of 5 CLIs) passing install + correction-capture smoke test
  • Result: DISCARD verdict — baseline remained at 1/5 (20%) because prerequisite implementation GRA-55 was not merged
  • Findings: Only Claude Code had working bidirectional hook capture; Codex, Hermes, and OpenCode still lacked post-tool/session-end capture paths
  • Follow-up: Issue GRA-1163 filed to re-implement missing hooks with explicit verification gate
  • Documentation-only change (+96 lines); no runtime code modifications or breaking changes
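
The metric and decision rule summarized above reduce to a small calculation. The sketch below is illustrative only: the function names and the `KEEP` label are assumptions, as is the per-CLI breakdown (the PR reports only the 1/5 total and that Claude Code was the one CLI with working capture).

```python
# Hypothetical sketch of the GRA-374 metric; names are not taken from the repository.
THRESHOLD_PERCENT = 80  # success metric from the summary: >= 80% (4 of 5 CLIs)


def success_rate(results: dict[str, bool]) -> float:
    """Fraction of CLIs that passed the install + correction-capture smoke test."""
    return sum(results.values()) / len(results)


def verdict(results: dict[str, bool]) -> str:
    """Apply the decision rule using integer arithmetic to avoid float rounding."""
    passed = sum(results.values())
    meets_threshold = passed * 100 >= THRESHOLD_PERCENT * len(results)
    # "KEEP" is an assumed label; the PR only names the DISCARD outcome.
    return "KEEP" if meets_threshold else "DISCARD"


# Assumed per-CLI outcomes consistent with the reported 1/5 result.
observed = {
    "Claude Code": True,
    "Codex": False,
    "Cursor": False,
    "Hermes": False,
    "OpenCode": False,
}

print(success_rate(observed), verdict(observed))
```

With one of five CLIs passing, the rate is 20% and the rule yields DISCARD; four of five would be the smallest passing count.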

Walkthrough

This PR adds a research experiment documentation file for GRA-374, specifying a 7-day multi-CLI installer evaluation across Claude Code, Codex, Cursor, Hermes, and OpenCode. The document defines the success metric (≥80% passing), baseline (20%), runbook procedures, final results (verdict: DISCARD, 20% outcome), and filed follow-up work.

Changes

GRA-374 Multi-CLI Installer Experiment

| Layer / File(s) | Summary |
|---|---|
| Experiment specification and evaluation results: `Gradata/docs/research/gra-374-multi-cli-install-success-rate.md` | Complete experiment spec with objective, metric definition, baseline, install-and-smoke-test runbook, decision rules, measured outcome (1/5 CLIs passing), and follow-up context on post-tool capture hooks. |

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~5 minutes

Suggested labels

docs

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: adding a documentation artifact for the GRA-374 experiment. |
| Description check | ✅ Passed | The description is well-related to the changeset, detailing what artifact is being added, what it documents, and verification steps. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch docs/gra-374-experiment-artifact

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot added the docs label Jun 4, 2026

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@Gradata/docs/research/gra-374-multi-cli-install-success-rate.md`:
- Line 4: The experiment window "Window: 2026-05-12 to 2026-05-19" contradicts
the stated 7‑day duration; update the window string (or nearby sentence) to
either end at 2026-05-18 to be inclusive 7 days or add explicit wording that the
end date is exclusive (e.g., "2026-05-12 to 2026-05-19 (end date exclusive)");
apply the same change wherever the same window appears (the "Window: 2026-05-12
to 2026-05-19" text and any duplicate instance noted as "Also applies to:
10-10").
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 32098443-860d-4e0a-a335-49c9b4a2c3df

📥 Commits

Reviewing files that changed from the base of the PR and between 4dfe596 and b104c2b.

📒 Files selected for processing (1)
  • Gradata/docs/research/gra-374-multi-cli-install-success-rate.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (8)
  • GitHub Check: pytest ubuntu-latest / py3.11
  • GitHub Check: pytest windows-latest / py3.12
  • GitHub Check: pytest macos-latest / py3.12
  • GitHub Check: pytest windows-latest / py3.11
  • GitHub Check: pytest macos-latest / py3.11
  • GitHub Check: pytest ubuntu-latest / py3.12
  • GitHub Check: pytest (py3.12)
  • GitHub Check: pytest (py3.11)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-04-17T17:18:07.439Z
Learnt from: Gradata
Repo: Gradata/gradata PR: 0
File: :0-0
Timestamp: 2026-04-17T17:18:07.439Z
Learning: In PR `#102` (gradata/gradata), Round 2 addressed: cli.py env-first brain resolution (GRADATA_BRAIN > --brain-dir > cwd), _tenant.py corrupt .tenant_id overwrite, _env_int default clamping to minimum, and _events.py tenant-scoped fallback SELECT for dedup. All ruff and 99 tests green after these fixes.

Applied to files:

  • Gradata/docs/research/gra-374-multi-cli-install-success-rate.md

# GRA-374: multi_cli_install_success_rate experiment

Status: evaluated — DISCARD
Window: 2026-05-12 to 2026-05-19
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify the experiment window to match the stated 7-day duration.

2026-05-12 to 2026-05-19 reads as 8 days (inclusive), which conflicts with “quick 7-day experiment.” Please either adjust one date (e.g., end at 2026-05-18) or explicitly state exclusive-end semantics.

Proposed doc fix
-Window: 2026-05-12 to 2026-05-19
+Window: 2026-05-12 to 2026-05-18

Also applies to: 10-10
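
The day count behind this comment can be checked with the standard library. This is a minimal sketch; the helper name `window_days` is hypothetical and is not part of the repository.

```python
from datetime import date


def window_days(start: date, end: date, inclusive_end: bool = True) -> int:
    """Number of calendar days in a window, counting the end date only if inclusive."""
    span = (end - start).days
    return span + 1 if inclusive_end else span


start = date(2026, 5, 12)

# Window as written in the artifact, read inclusively: 8 days.
print(window_days(start, date(2026, 5, 19)))

# Same dates with an exclusive end date: 7 days.
print(window_days(start, date(2026, 5, 19), inclusive_end=False))

# Proposed fix, ending on 2026-05-18 inclusive: 7 days.
print(window_days(start, date(2026, 5, 18)))
```

Either option suggested in the review (ending at 2026-05-18, or marking the end date as exclusive) yields the stated 7-day duration.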

